Scope of work - Convert PDF files into editable, searchable data through OCR, and rename each PDF file.
Task performed - The client provided the PDF files in bulk via secure FTP. We downloaded them from the client's server to our own secured development server. The files were supplied in batches, grouped under different folders. We used proprietary OCR software to convert the input PDF files into editable, searchable output files. Once a file had been fully OCRed, it was renamed based on details contained in the OCRed document. On completion of each batch, a random sample of files was checked for mistakes; if a mistake was found, the entire batch was redone from scratch. Each completed batch was then securely uploaded back to the client's server.
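The renaming and spot-check steps can be illustrated with a minimal Python sketch. It is not the tooling used on the project: the OCR software was proprietary and is not shown, and the description does not say which details drove the file names or what share of each batch was checked. The sketch therefore assumes the name comes from the first non-empty line of the OCR text and that 5% of a batch is sampled. The names name_from_text, pick_sample and batch_passes are hypothetical.

```python
import random
import re


def name_from_text(text, fallback):
    """Build a file-system-safe name from the first non-empty line of OCR text.

    Assumption: the naming detail sits on the first non-empty line.
    """
    for line in text.splitlines():
        # Collapse every run of non-alphanumeric characters into one underscore.
        cleaned = re.sub(r"[^A-Za-z0-9]+", "_", line).strip("_")
        if cleaned:
            return cleaned[:80]  # keep names reasonably short
    return fallback


def pick_sample(files, fraction=0.05, seed=None):
    """Pick a random subset of a batch for manual review (at least one file).

    Assumption: the 5% default is illustrative, not the project's figure.
    """
    files = list(files)
    if not files:
        return []
    k = min(len(files), max(1, round(len(files) * fraction)))
    return random.Random(seed).sample(files, k)


def batch_passes(sample, has_mistake):
    """Return True only if no sampled file has a mistake.

    A False result means the whole batch is redone, as described above.
    """
    return not any(has_mistake(f) for f in sample)
```

In use, has_mistake would be whatever check the reviewer applies to a sampled file; a single failure sends the whole batch back for reprocessing rather than just the failing file.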
Result - The first challenge was timing: we had to wait until the client had uploaded each batch of 200,000 files to their FTP server before downloading it securely to our server. Once we had confirmed that a download was successful, we assigned the folders to a team of ten coders, collected the completed work, ran QA on it, and delivered the results to the client. Another big challenge was keeping the team working in shifts to achieve a short turnaround. The work was completed in two weeks, and the client saved 30-40% on the total project.