The XML schema is the backbone of DAC8 reporting. Every CASP must generate reports in a specific XML format that conforms to the DAC8/CARF standard. Understanding the schema's structure, elements, and validation rules is essential for building compliant reporting systems.
Overview of the XML Schema
The DAC8 XML reporting schema is derived from the OECD's CARF XML schema, with adaptations to reflect the EU's specific requirements under the Directive on Administrative Cooperation. The schema defines the structure, data types, and validation rules for the electronic reports that CASPs must submit to their home Member State's tax authority.
The schema builds on standard technologies: XML Schema Definition (XSD) for structural validation; ISO standards for country codes (ISO 3166), currency codes (ISO 4217), and date formats (ISO 8601); and UTF-8 encoding for all text fields.
Top-Level Structure
A DAC8 XML report consists of several hierarchical elements:
MessageSpec. The message header containing metadata about the report, including the sending competent authority identifier, the receiving competent authority identifier, the message type (new report, correction, or deletion), a unique message reference ID, and the reporting period (calendar year).
CARFBody. The main body of the report, containing one or more ReportingGroup elements. Each ReportingGroup represents the reports from a single Reporting CASP and contains the CASP's identification information and the individual account reports.
ReportingCASP. Identification of the reporting entity, including its legal name, address, jurisdiction of tax residence, TIN, and unique identifier (such as its MiCA authorization number).
AccountReport. Individual reports for each reportable user, containing the user's identification information, transaction data, and account details.
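The hierarchy above can be sketched with Python's standard library. This is a minimal illustration, not the official structure: the root element name `DAC8Report`, the child names `MessageRefId`, `ReportingPeriod`, and `Name`, and the plain-year representation of the reporting period are assumptions, and namespaces are omitted. The published XSD determines the real names, order, and namespaces.

```python
import xml.etree.ElementTree as ET


def build_report_skeleton(message_ref_id: str, reporting_year: int, casp_name: str) -> ET.Element:
    """Build an empty report with the four hierarchical elements described above."""
    # "DAC8Report" is a placeholder root name, not taken from the official XSD.
    root = ET.Element("DAC8Report")

    # MessageSpec: header metadata about the message.
    spec = ET.SubElement(root, "MessageSpec")
    ET.SubElement(spec, "MessageRefId").text = message_ref_id      # assumed child name
    ET.SubElement(spec, "ReportingPeriod").text = str(reporting_year)  # assumed format

    # CARFBody holds one ReportingGroup per Reporting CASP.
    body = ET.SubElement(root, "CARFBody")
    group = ET.SubElement(body, "ReportingGroup")

    # ReportingCASP identifies the reporting entity.
    casp = ET.SubElement(group, "ReportingCASP")
    ET.SubElement(casp, "Name").text = casp_name                   # assumed child name

    # One AccountReport per reportable user; left empty in this sketch.
    ET.SubElement(group, "AccountReport")
    return root


root = build_report_skeleton("MSG-2026-0001", 2026, "Example CASP Ltd")
ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))
```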
User Identification Elements
For each reportable user, the schema requires the following identification elements:
Individual users. Name (first name and last name, using the OECD's NameFix type), address (using the structured address format with street, building, suite, floor, city, postal code, and country code), TIN (with the jurisdiction that issued it), date of birth, place of birth (when available), and the self-certification status.
Entity users. Legal name, address, TIN, entity type classification, and for each controlling person, the same identification elements as for individual users.
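One way to prepare for these requirements is to model the identification data internally before mapping it to XML. The sketch below is a hypothetical internal model: the class and field names are illustrative and do not come from the schema, and the completeness check covers only a few of the listed elements.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Address:
    street: str
    city: str
    postal_code: str
    country_code: str              # ISO 3166 two-letter code
    building: Optional[str] = None
    suite: Optional[str] = None
    floor: Optional[str] = None


@dataclass
class Tin:
    value: str
    issued_by: str                 # jurisdiction that issued the TIN


@dataclass
class IndividualUser:
    first_name: str
    last_name: str
    address: Address
    tins: List[Tin]
    birth_date: date
    self_certified: bool
    birth_place: Optional[str] = None   # reported when available


@dataclass
class EntityUser:
    legal_name: str
    address: Address
    tins: List[Tin]
    entity_type: str
    controlling_persons: List[IndividualUser]   # same elements as individual users


def missing_fields(user: IndividualUser) -> List[str]:
    """Return the names of identification fields that are empty or malformed."""
    problems = []
    if not user.first_name.strip():
        problems.append("first_name")
    if not user.last_name.strip():
        problems.append("last_name")
    if not user.tins:
        problems.append("tin")
    if len(user.address.country_code) != 2:
        problems.append("country_code")
    return problems
```

Running `missing_fields` over every user before report generation gives an early list of records that need remediation, for example users whose TIN was never collected.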
Transaction Data Elements
The schema structures transaction data into several categories:
Payment type. Each transaction must be classified as one of the following types: CRS501 (exchange of crypto-assets for fiat currency), CRS502 (exchange of crypto-assets for other crypto-assets), CRS503 (transfer of crypto-assets), or CRS504 (retail payment transactions using crypto-assets).
Aggregated amounts. For each payment type and each crypto-asset, the report must include the aggregate gross amount in the reporting currency, the number of units of the crypto-asset transacted, the aggregate fair market value in the reporting currency, and the total number of transactions.
Crypto-asset identification. Each crypto-asset must be identified by its name, its distributed ledger identifier (such as the contract address for tokens), and its classification under MiCA (asset-referenced token, e-money token, or other crypto-asset).
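The aggregation step described above can be sketched as follows. The `Transaction` and `Aggregate` classes are hypothetical internal structures, and the payment-type labels are plain placeholders standing in for the schema's own codes. `Decimal` is used instead of `float` so that summed amounts do not pick up binary rounding errors.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Transaction:
    payment_type: str            # placeholder label, not an official schema code
    asset: str                   # crypto-asset name or ticker
    units: Decimal               # units of the crypto-asset transacted
    gross_amount: Decimal        # in the reporting currency
    fair_market_value: Decimal   # in the reporting currency


@dataclass
class Aggregate:
    gross_amount: Decimal = Decimal("0")
    units: Decimal = Decimal("0")
    fair_market_value: Decimal = Decimal("0")
    count: int = 0


def aggregate(transactions: Iterable[Transaction]) -> Dict[Tuple[str, str], Aggregate]:
    """Sum amounts, units, and values per (payment type, crypto-asset) pair."""
    totals: Dict[Tuple[str, str], Aggregate] = defaultdict(Aggregate)
    for tx in transactions:
        agg = totals[(tx.payment_type, tx.asset)]
        agg.gross_amount += tx.gross_amount
        agg.units += tx.units
        agg.fair_market_value += tx.fair_market_value
        agg.count += 1
    return dict(totals)


sample = [
    Transaction("CRYPTO_TO_FIAT", "BTC", Decimal("0.5"), Decimal("15000.00"), Decimal("15000.00")),
    Transaction("CRYPTO_TO_FIAT", "BTC", Decimal("0.25"), Decimal("7600.50"), Decimal("7600.50")),
    Transaction("CRYPTO_TO_CRYPTO", "ETH", Decimal("2"), Decimal("5000.00"), Decimal("5000.00")),
]
totals = aggregate(sample)
```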
Correction and Deletion Mechanisms
The schema supports three types of messages:
New data (CARF1). The initial report for a reporting period. This is the standard annual submission.
Corrected data (CARF2). Corrections to previously submitted data. Corrections must reference the original message and the specific elements being corrected using the DocRefId system.
Deleted data (CARF3). Deletion of previously submitted data that should not have been reported. Deletions must also reference the original elements using DocRefId.
Each AccountReport contains a unique DocRefId that enables the correction and deletion mechanism. CASPs must maintain a record of all DocRefIds generated to support subsequent corrections.
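A simple way to keep that record is a registry that issues each DocRefId and refuses corrections or deletions that point at an unknown identifier. The sketch below is an assumption-laden illustration: the identifier format (a prefix plus a UUID), the `DocRefRegistry` class, and the in-memory storage are all hypothetical, and a production system would persist the records and follow the DocRefId format rules published with the schema.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DocRecord:
    doc_ref_id: str
    message_type: str                # "new", "correction", or "deletion"
    corrects: Optional[str] = None   # DocRefId of the element being corrected or deleted


@dataclass
class DocRefRegistry:
    prefix: str                      # hypothetical prefix, e.g. country, year, CASP id
    records: Dict[str, DocRecord] = field(default_factory=dict)

    def _new_id(self) -> str:
        # A UUID keeps identifiers unique across messages and reporting periods.
        return f"{self.prefix}-{uuid.uuid4()}"

    def register_new(self) -> DocRecord:
        record = DocRecord(self._new_id(), "new")
        self.records[record.doc_ref_id] = record
        return record

    def register_followup(self, original_id: str, message_type: str) -> DocRecord:
        """Register a correction or deletion that references an earlier DocRefId."""
        if message_type not in ("correction", "deletion"):
            raise ValueError(f"unsupported message type: {message_type}")
        if original_id not in self.records:
            raise KeyError(f"unknown DocRefId: {original_id}")
        record = DocRecord(self._new_id(), message_type, corrects=original_id)
        self.records[record.doc_ref_id] = record
        return record


registry = DocRefRegistry(prefix="LU2026-CASP01")
original = registry.register_new()
correction = registry.register_followup(original.doc_ref_id, "correction")
```

The correction receives its own DocRefId and stores the identifier of the element it replaces, so the history of each AccountReport can be reconstructed later.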
Validation Rules
Tax authorities apply several layers of validation to incoming DAC8 reports: schema validation (conformance to the XSD); business rules (logical consistency checks on dates, amounts, and TIN formats); TIN validation (format verification against known TIN patterns for each jurisdiction); and cross-referencing (checks for duplicate reports and for inconsistent data across reporting periods).
Reports that fail validation may be rejected, requiring the CASP to correct and resubmit. CASPs should implement their own pre-submission validation to minimize rejection rates.
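A pre-submission check for the business-rule layer might look like the sketch below. The record layout (a flat dictionary of strings) and the field names are hypothetical, the country-code check verifies format only rather than membership in ISO 3166, and the single TIN pattern (11 digits for Germany) is deliberately simplified and ignores checksums. Structural validation against the XSD would need a schema-aware XML library and is not shown here.

```python
import re
from datetime import date
from decimal import Decimal, InvalidOperation
from typing import Dict, List

COUNTRY_CODE = re.compile(r"[A-Z]{2}")

# Illustrative format-only patterns; a real table would cover every jurisdiction.
TIN_PATTERNS = {"DE": re.compile(r"\d{11}")}


def validate_record(record: Dict[str, str], today: date) -> List[str]:
    """Return a list of business-rule violations; an empty list means no findings."""
    errors = []

    if not COUNTRY_CODE.fullmatch(record.get("country_code", "")):
        errors.append("country_code: expected two uppercase letters")

    try:
        born = date.fromisoformat(record.get("birth_date", ""))
        if born > today:
            errors.append("birth_date: lies in the future")
    except ValueError:
        errors.append("birth_date: not an ISO 8601 date (YYYY-MM-DD)")

    try:
        if Decimal(record.get("gross_amount", "")) < 0:
            errors.append("gross_amount: negative")
    except InvalidOperation:
        errors.append("gross_amount: not a decimal number")

    # Only jurisdictions with a known pattern are checked.
    pattern = TIN_PATTERNS.get(record.get("tin_issued_by", ""))
    if pattern is not None and not pattern.fullmatch(record.get("tin", "")):
        errors.append("tin: does not match the expected format")

    return errors
```

Collecting all findings in a list, rather than stopping at the first one, lets a CASP fix a record in a single pass before resubmission.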
Encoding and Character Sets
All DAC8 XML reports must use UTF-8 encoding. Special characters in names and addresses must be properly encoded using XML entities or CDATA sections. CASPs serving international user bases must ensure that their systems can correctly handle and transmit names and addresses in various scripts and character sets.
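In practice, an XML library handles both escaping and encoding, which is safer than assembling XML by string concatenation. The sketch below uses Python's standard `xml.etree.ElementTree`; the element names `Name`, `FirstName`, and `LastName` are illustrative only.

```python
import xml.etree.ElementTree as ET


def serialize(element: ET.Element) -> bytes:
    """Serialize to UTF-8 bytes with an XML declaration (Python 3.8+)."""
    return ET.tostring(element, encoding="utf-8", xml_declaration=True)


name = ET.Element("Name")
ET.SubElement(name, "FirstName").text = "Zoë 山田"
ET.SubElement(name, "LastName").text = "Müller & Søn <Ltd>"

payload = serialize(name)

# The library escapes markup characters and writes non-ASCII text as UTF-8 bytes.
# Parsing the payload again restores the original strings.
restored = ET.fromstring(payload)
print(restored.find("LastName").text)
```

The ampersand and angle brackets are written as entities, while accented and non-Latin characters are emitted as UTF-8 bytes, so a round trip through a parser returns the original text unchanged.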
Practical Implementation
Building a DAC8 XML generation system involves six tasks: understanding the full XSD schema and all its dependencies; mapping internal data structures to the schema's elements; implementing validation logic that mirrors the receiving authority's checks; handling corrections and deletions with proper DocRefId management; testing against the sample data and validation tools provided by tax authorities; and establishing a production workflow for annual report generation and submission.
Conclusion
The DAC8 XML schema is technically demanding but well-structured. CASPs that invest in understanding the schema's requirements and building robust generation and validation systems will be well-positioned to meet their reporting obligations accurately and efficiently. Testing against the schema should begin well before the first reporting deadline to identify and resolve any data quality or formatting issues.