We have been testing our eDiscovery file types and international language options on the EDRM Data Set (http://www.edrm.net/projects/dataset) and on select customer jobs that had problems with other eDiscovery methods. In the process, we have compiled a list of file types from iPro (see below) that have been successfully processed using our technology.
We are also testing the Lexis Nexis Law pre-Discovery modules, which currently cover 4,145 file types. To view those file types, click here:
Law pre-Discovery has an add-in module for advanced OCR, which handles converting Asian characters to 16-bit Unicode text.
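To make "16-bit Unicode text" concrete, here is a minimal Python sketch of what that representation looks like. It is not the Law pre-Discovery module's code; the function name `to_utf16_units` and the sample string are hypothetical, and the sketch assumes "16-bit Unicode" means the UTF-16 encoding.

```python
# Illustrative sketch only: NOT the OCR module's implementation.
# Assumption: "16-bit Unicode text" refers to UTF-16.

def to_utf16_units(text: str) -> list:
    """Return the 16-bit code units of `text` encoded as UTF-16 (little-endian, no BOM)."""
    raw = text.encode("utf-16-le")
    # Every UTF-16 code unit is exactly two bytes wide.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


sample = "電子証拠"  # hypothetical sample text, as an OCR step might recognize it
units = to_utf16_units(sample)

# Each of these four characters fits in a single 16-bit unit.
print([f"{u:04X}" for u in units])
```

Most commonly used Chinese, Japanese, and Korean characters fit in a single 16-bit unit like this. Rarer characters outside the Basic Multilingual Plane take two units (a surrogate pair), which is worth checking when verifying OCR output.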
Most eDiscovery software vendors support conversion of English and European languages into searchable PDFs, but we have found that many cannot properly OCR Asian character sets.
So we developed open source software that converts Asian character sets into searchable PDFs. Contact us to get a copy of the free program.
Here is the compilation of more than 750 file types that have been tested with the iPro eCapture software we are evaluating: