Tax dept leans on 'Big Data' to mark out multiple PAN holders
To plug tax loopholes, the I-T department will use Big Data analytics to track down evaders, using shared details such as a common address, mobile number and e-mail to establish relationships between their multiple PANs.
The department, with support from private firms, will analyse the voluminous data available after demonetisation to check for relationships between PAN holders.
The Managed Service Provider (MSP), which the I-T department plans to hire, will design and operationalise an analytical solution that will help collate data, match it and identify relationships, as well as cluster PAN and non-PAN data, an official said.
The analytical solution would help the department gather data received from banks, post offices and other sources for linking of information and identification of duplicate details.
It will also identify records with errors or other defects for resubmission. "The data quality errors and defects will be communicated to the reporting person or entities, say, banks or post offices for correction and improving data quality," the official added.
The PAN-based demonetisation information will be integrated and matched with I-T databases such as tax returns, TDS, third-party reporting and tax payments to build a comprehensive profile of each taxpayer.
It will help identify links between PAN holders on the basis of relationships (business, asset and transactional associations) available in various databases, the official said, adding that the analytics will cluster PAN-linked demonetisation data using the identified relationships as well as common address, mobile number, e-mail and bank branch.
Also, it will cluster non-PAN demonetised data using common name, address, mobile number, e-mail and bank branch.
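The clustering described above, in which records are grouped whenever they share an address, mobile number or e-mail, can be illustrated with a minimal sketch. This is not the department's or the MSP's actual method, which the report does not describe; the function name `cluster_records`, the field names and the sample records are all hypothetical, and the union-find approach is simply one common way to group records that are linked through shared attributes.

```python
from collections import defaultdict


def cluster_records(records, keys=("address", "mobile", "email")):
    """Group record ids that share any non-empty value for one of `keys`.

    Illustrative only: field names and matching rules are assumptions.
    """
    # Union-find: every record starts out as its own cluster.
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Remember the first record seen for each (field, value) pair and
    # merge any later record carrying the same value into its cluster.
    first_seen = {}
    for r in records:
        for k in keys:
            value = (r.get(k) or "").strip().lower()  # crude normalisation
            if not value:
                continue
            token = (k, value)
            if token in first_seen:
                union(first_seen[token], r["id"])
            else:
                first_seen[token] = r["id"]

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical sample data, not real records.
records = [
    {"id": "PAN-A", "address": "12 Example Road", "mobile": "9000000001", "email": "a@example.com"},
    {"id": "PAN-B", "address": "12 example road ", "mobile": "9000000002", "email": "b@example.com"},
    {"id": "PAN-C", "address": "7 Sample Lane", "mobile": "9000000002", "email": "c@example.com"},
    {"id": "PAN-D", "address": "99 Other Street", "mobile": "9000000004", "email": "d@example.com"},
]
clusters = cluster_records(records)
print(clusters)
```

In this sample, PAN-A and PAN-B share an address and PAN-B and PAN-C share a mobile number, so all three fall into one cluster even though PAN-A and PAN-C have nothing directly in common. Real data would need far more careful normalisation of addresses and names than the lower-casing used here.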
Taxpayer segmentation on the basis of taxpayers' status, type of ITR form used, nature of business, taxpayer segment, age of the individual and compliance history will also have to be prepared.
The solution will also prioritise demonetisation data based on taxpayer segment, relationships, clusters, rules and a risk matrix.
"Different types of interventions (send e-mail, SMS, outbound call, letter, notice, verification, investigation) can be selected for taxpayer priority and segment," the official added.
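How a priority might map to one of the interventions the official listed can be sketched as follows. The report does not disclose the actual rules or risk matrix, so the scoring factors, weights and field names below are invented purely for illustration, as is the assumption that the listed interventions form an escalation ladder from mildest to most severe.

```python
# Interventions in the order the official listed them; treating this
# order as an escalation ladder is an assumption made for this sketch.
INTERVENTIONS = [
    "e-mail", "SMS", "outbound call", "letter",
    "notice", "verification", "investigation",
]


def risk_score(case):
    """Return a score from 0 to 6 using made-up, illustrative weights."""
    score = 0
    # More linked records in the cluster raises the score (capped at 3).
    score += max(0, min(case["cluster_size"] - 1, 3))
    # Non-PAN data is treated as slightly riskier (hypothetical rule).
    score += 0 if case["has_pan"] else 1
    # A poor compliance history adds the most weight (hypothetical rule).
    score += 0 if case["compliant_history"] else 2
    return score


def choose_intervention(case):
    """Pick the intervention whose position matches the risk score."""
    return INTERVENTIONS[risk_score(case)]


low = {"cluster_size": 1, "has_pan": True, "compliant_history": True}
high = {"cluster_size": 5, "has_pan": False, "compliant_history": False}
print(choose_intervention(low), choose_intervention(high))
```

A real risk matrix would combine taxpayer segment and priority rather than a single additive score; the sketch only shows the general idea of routing cases to graded responses.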