Supported and Unsupported File Formats

Supported File Formats


Text Documents

  • PDF (.pdf)
  • Microsoft Word (.doc, .docx)
  • Text files (.txt)
  • Rich Text Format (.rtf)
  • Markdown (.md)
  • OpenDocument Text (.odt)
  • HTML (.html, .htm)


Spreadsheets

  • Microsoft Excel (.xls, .xlsx)
  • CSV (Comma Separated Values) (.csv)
  • TSV (Tab Separated Values) (.tsv)
  • OpenDocument Spreadsheet (.ods)


Presentations

  • Microsoft PowerPoint (.ppt, .pptx)
  • OpenDocument Presentation (.odp)


Data Formats

  • JSON (.json)
  • XML (.xml)
  • YAML (.yaml, .yml)


Email

  • Outlook Messages (.msg)
  • Email messages (.eml)


Image-Based Documents (with OCR processing)

  • Scanned PDFs
  • Image files containing text (.jpg, .png)


While we can process image-based documents, the accuracy of text extraction depends on image quality.

Supporting these formats requires an appropriate text-extraction library for each one, such as:

  • pdf.js for PDFs
  • mammoth.js for Word documents
  • SheetJS for Excel files
  • An OCR service like Tesseract.js for image-based documents
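
As a rough illustration of how the libraries listed above could be matched to uploaded files, the sketch below picks an extraction approach from the file extension. It only returns the name of a library and calls none of them. The function name, the "plain-text" and "unmapped" categories, and the choice of which extensions map to which library are assumptions made for this example, not a description of our actual pipeline.

```typescript
// Hypothetical routing table. Library names come from the list above;
// every identifier and category below is an assumption for illustration.
type Extractor =
  | "pdf.js"
  | "mammoth.js"
  | "SheetJS"
  | "Tesseract.js"
  | "plain-text"
  | "unmapped";

const EXTRACTOR_BY_EXTENSION = new Map<string, Extractor>([
  [".pdf", "pdf.js"],
  [".docx", "mammoth.js"],
  [".xls", "SheetJS"],
  [".xlsx", "SheetJS"],
  [".jpg", "Tesseract.js"],
  [".png", "Tesseract.js"],
  [".txt", "plain-text"], // assumed: read directly, no library needed
  [".md", "plain-text"],
]);

// Returns the extraction approach for a file name, or "unmapped" when the
// extension is missing or is not covered by the table above.
function pickExtractor(fileName: string): Extractor {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return "unmapped";
  }
  const extension = fileName.slice(dot).toLowerCase();
  return EXTRACTOR_BY_EXTENSION.get(extension) ?? "unmapped";
}
```

Routing by extension alone cannot tell a scanned PDF from a text PDF, so a real pipeline would likely need a second step that falls back to OCR when a PDF yields no text.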


Size Limitations

  • Maximum file size: 25 MB per document
  • Maximum total project size: 100 MB
  • Recommended maximum per project: 50 pages of text or 10 spreadsheets
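
If you want to check a project against the two hard size limits before uploading, something like the following sketch could work. It assumes 1 MB means 1,048,576 bytes and that a size exactly at the limit is accepted; the page does not specify either point, and the type and function names are hypothetical.

```typescript
// Assumption: 1 MB = 1024 * 1024 bytes. The limits themselves are from the list above.
const MB = 1024 * 1024;
const MAX_FILE_BYTES = 25 * MB;
const MAX_PROJECT_BYTES = 100 * MB;

// Hypothetical shape for a file that is about to be uploaded.
interface UploadCandidate {
  name: string;
  sizeBytes: number;
}

// Returns a list of human-readable problems; an empty list means the
// project fits within both size limits.
function checkProjectSize(files: UploadCandidate[]): string[] {
  const problems: string[] = [];
  let totalBytes = 0;
  for (const file of files) {
    totalBytes += file.sizeBytes;
    if (file.sizeBytes > MAX_FILE_BYTES) {
      problems.push(`${file.name} exceeds the 25 MB per-document limit`);
    }
  }
  if (totalBytes > MAX_PROJECT_BYTES) {
    problems.push("project exceeds the 100 MB total limit");
  }
  return problems;
}
```

The recommended page and spreadsheet counts are left out because they are guidance rather than hard limits.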


Formats We Cannot Process


Image-Based Documents Without Quality OCR

  • Scanned documents with poor quality or resolution
  • Handwritten documents
  • Faxed documents with artifacts
  • Documents with watermarks that obscure text
  • Images with text embedded in complex backgrounds
  • Documents with non-standard fonts or stylized text


Protected/Secured Documents

  • Password-protected PDFs
  • DRM-protected documents
  • Encrypted files of any format
  • Documents with editing restrictions


Complex Formats

  • PDF forms with fillable fields (content may not extract properly)
  • Documents with complex layouts (multiple columns, text boxes, etc.)
  • PDFs created as image compilations without OCR
  • Scanned books with curved page surfaces


Specialized Formats

  • CAD files (.dwg, .dxf)
  • GIS/Mapping files
  • Proprietary financial software exports
  • Database files (.mdb, .accdb)
  • Compressed archives (.zip, .rar): extract the contents and upload the individual files instead


Media Files

  • Audio files (.mp3, .wav)
  • Video files (.mp4, .mov)
  • Pure image files without text content
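
The specialized and media formats above that are identified by extension could be screened out before upload with a simple lookup, as in this sketch. The function name and the wording of the reasons are assumptions for illustration. Categories listed without extensions (GIS files, proprietary financial exports, images without text) are not covered because the page does not name extensions for them.

```typescript
// Extensions are taken from the lists above; the reason strings are illustrative.
const REJECTION_REASON_BY_EXTENSION = new Map<string, string>([
  [".dwg", "CAD file"],
  [".dxf", "CAD file"],
  [".mdb", "database file"],
  [".accdb", "database file"],
  [".zip", "compressed archive (extract it and upload the contents)"],
  [".rar", "compressed archive (extract it and upload the contents)"],
  [".mp3", "audio file"],
  [".wav", "audio file"],
  [".mp4", "video file"],
  [".mov", "video file"],
]);

// Returns why a file would be rejected, or null when its extension is not
// on the known-unsupported list. A null result does not prove the file is
// supported; it only means this lookup has no objection.
function rejectionReason(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return null;
  }
  const extension = fileName.slice(dot).toLowerCase();
  return REJECTION_REASON_BY_EXTENSION.get(extension) ?? null;
}
```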


Size/Complexity Issues

  • Documents exceeding 100 pages
  • Spreadsheets with more than 10,000 rows
  • Files larger than 25MB
  • Documents with hundreds of embedded images
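
The page and row limits above can be expressed as a small check, sketched below under the assumption that page and row counts are already known for each document. The `DocumentStats` shape and the function name are hypothetical. The file-size limit is left out here, and the embedded-image item is omitted because no exact threshold is stated for it.

```typescript
// Thresholds from the list above.
const MAX_PAGES = 100;
const MAX_ROWS = 10000;

// Hypothetical per-document statistics; either count may be unknown.
interface DocumentStats {
  pageCount?: number;
  rowCount?: number;
}

// Returns the complexity limits a document exceeds; an empty list means
// none of the checked limits is exceeded.
function complexityIssues(stats: DocumentStats): string[] {
  const issues: string[] = [];
  if (stats.pageCount !== undefined && stats.pageCount > MAX_PAGES) {
    issues.push(`document has ${stats.pageCount} pages (limit ${MAX_PAGES})`);
  }
  if (stats.rowCount !== undefined && stats.rowCount > MAX_ROWS) {
    issues.push(`spreadsheet has ${stats.rowCount} rows (limit ${MAX_ROWS})`);
  }
  return issues;
}
```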

Select a package that fits your needs and start uploading your files today

Copyright © 2025 Small Time Investment - All Rights Reserved.

Pixel Dreams AI Data Synthesis