Unable to parse PDF document using Tika parser in AEM 6.0
Hi Team,
I am unable to parse or read the text of a PDF file using the Tika parser. This is the code I am using:
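import java.io.InputStream;
import org.apache.sling.api.resource.Resource;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import com.day.cq.dam.api.Asset;
import com.day.cq.dam.commons.util.DamUtil;

Asset asset = DamUtil.resolveToAsset(dataResource);
Resource original = asset.getOriginal();
// Cap the extracted text at 10 * 1024 * 1024 characters
ContentHandler handler = new BodyContentHandler(10 * 1024 * 1024);
Metadata metadata = new Metadata();
AutoDetectParser parser = new AutoDetectParser();
ParseContext context = new ParseContext();
// Register under Parser.class, the key Tika looks up for embedded documents
context.set(Parser.class, parser);
// try-with-resources closes the stream even if parsing fails
try (InputStream is = original.adaptTo(InputStream.class)) {
    parser.parse(is, handler, metadata, context);
} catch (Exception e) {
    throw new Exception("Error parsing file " + asset.getPath(), e);
}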
When this runs, I get a Tika parse exception.
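Since the catch block wraps the original exception, the real reason for the failure may be buried in the cause chain. Below is a minimal sketch of how the innermost cause could be printed to narrow this down. The names RootCauseDemo, rootCause and describe are hypothetical and are not part of Tika or AEM, and the exception built in main is only a stand-in, not the actual error from my system.

```java
import java.io.IOException;

// Hypothetical helper, not part of Tika or AEM: finds the innermost cause
// of a wrapped exception so the underlying failure can be logged.
final class RootCauseDemo {

    // Follows getCause() until it reaches the last exception in the chain.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Builds a one-line summary: root exception class name plus its message.
    static String describe(Throwable t) {
        Throwable root = rootCause(t);
        return root.getClass().getName() + ": " + root.getMessage();
    }

    public static void main(String[] args) {
        // Stand-in for an exception wrapped the way the catch block wraps it.
        Exception wrapped = new Exception("Error parsing file",
                new RuntimeException("parse failed", new IOException("stream closed")));
        System.out.println(describe(wrapped));
    }
}
```

Calling describe(e) inside the catch block, before rethrowing, would put the innermost exception type and message into the log next to the asset path.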
Please help me resolve this issue, or share a link to documentation I can go through.
Thanks a lot.