Reading PDF files as a string in an iPhone application

I have a problem developing an iPhone application that reads PDFs. I tried the code below. I know I have used the wrong methods for parsing: these parsing methods are meant for searching. But I want to convert all the text of a PDF into a string. Take, for example, Apple's MobileHIG.pdf, which I have used in this code.

@implementation NetPDFViewController 

size_t totalPages; // a variable to store total pages 

// a method to get the pdf ref 
CGPDFDocumentRef MyGetPDFDocumentRef (const char *filename) { 
    CFStringRef path; 
    CFURLRef url; 
    CGPDFDocumentRef document; 
    path = CFStringCreateWithCString (NULL, filename,kCFStringEncodingUTF8); 
    url = CFURLCreateWithFileSystemPath (NULL, path, kCFURLPOSIXPathStyle, 0); 
    CFRelease (path); 
    document = CGPDFDocumentCreateWithURL (url);// 2 
    CFRelease(url); 
    size_t count = CGPDFDocumentGetNumberOfPages (document);// 3 
    if (count == 0) { 
     printf("`%s' needs at least one page!\n", filename); 
     CGPDFDocumentRelease (document); // do not leak the document; safe when it is NULL 
     return NULL; 
    } 
    return document; 
} 

// table methods to parse pdf 
static void op_MP (CGPDFScannerRef s, void *info) { 
    const char *name; 
    if (!CGPDFScannerPopName(s, &name)) 
     return; 
    printf("MP /%s\n", name); 
} 

static void op_DP (CGPDFScannerRef s, void *info) { 
    const char *name; 
    if (!CGPDFScannerPopName(s, &name)) 
     return; 
    printf("DP /%s\n", name); 
} 

static void op_BMC (CGPDFScannerRef s, void *info) { 
    const char *name; 
    if (!CGPDFScannerPopName(s, &name)) 
     return; 
    printf("BMC /%s\n", name); 
} 

static void op_BDC (CGPDFScannerRef s, void *info) { 
    const char *name; 
    if (!CGPDFScannerPopName(s, &name)) 
     return; 
    printf("BDC /%s\n", name); 
} 

static void op_EMC (CGPDFScannerRef s, void *info) { 
    // EMC ends a marked-content sequence and takes no operands, 
    // so there is no name to pop here 
    printf("EMC\n"); 
} 

// a method to display pdf page. 

void MyDisplayPDFPage (CGContextRef myContext,size_t pageNumber,const char *filename) { 
    CGPDFDocumentRef document; 
    CGPDFPageRef page; 
    document = MyGetPDFDocumentRef (filename);// 1 
    if (document == NULL) return; // nothing to parse or draw 
    totalPages=CGPDFDocumentGetNumberOfPages(document); 
    page = CGPDFDocumentGetPage (document, pageNumber);// 2 

    CGPDFDictionaryRef d; 

    d = CGPDFPageGetDictionary(page); 

// ----- edit problem here - CGPDFDictionary is completely unknown 
// ----- as we don't know keys & values of it. 
    CGPDFScannerRef myScanner; 
    CGPDFOperatorTableRef myTable; 
    myTable = CGPDFOperatorTableCreate(); 
    CGPDFOperatorTableSetCallback (myTable, "MP", &op_MP); 
    CGPDFOperatorTableSetCallback (myTable, "DP", &op_DP); 
    CGPDFOperatorTableSetCallback (myTable, "BMC", &op_BMC); 
    CGPDFOperatorTableSetCallback (myTable, "BDC", &op_BDC); 
    CGPDFOperatorTableSetCallback (myTable, "EMC", &op_EMC); 

    CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage (page);// 3 
    myScanner = CGPDFScannerCreate (myContentStream, myTable, NULL);// 4 

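    CGPDFScannerScan (myScanner);// 5 

    // release the parsing objects created above 
    CGPDFScannerRelease (myScanner); 
    CGPDFContentStreamRelease (myContentStream); 
    CGPDFOperatorTableRelease (myTable); 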

// CGPDFDictionaryRef d; 

    CGPDFStringRef str; // represents a sequence of bytes 

    d = CGPDFPageGetDictionary(page); 

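    // note: in the PDF format /Thumb is a stream (the page thumbnail), 
    // not a string, so this lookup normally fails 
    if (CGPDFDictionaryGetString(d, "Thumb", &str)){ 
     CFStringRef s; 
     s = CGPDFStringCopyTextString(str); 
     if (s != NULL) { 
      //need something in here in case it cant find anything 
      NSLog(@"%@ testing it", s); 
      CFRelease(s); // release only when non-NULL; CFRelease(NULL) crashes 
     } 
//  CFDataRef data = CGPDFStreamCopyData (stream, CGPDFDataFormatRaw); 
    } 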

// ----------------------------------- 

    CGContextDrawPDFPage (myContext, page);// 3 
    CGContextTranslateCTM(myContext, 0, 20); 
    CGContextScaleCTM(myContext, 1.0, -1.0); 
    CGPDFDocumentRelease (document);// 4 
} 

- (void)viewDidLoad { 
    [super viewDidLoad]; 


// -------------------------------------------------------- 
// code for simple direct image from pdf docs. 
    UIGraphicsBeginImageContext(CGSizeMake(320, 460)); 
    initialPage=28; 
    MyDisplayPDFPage(UIGraphicsGetCurrentContext(), initialPage, [[[NSBundle mainBundle] pathForResource:@"MobileHIG" ofType:@"pdf"] UTF8String]); 
    imgV.image=UIGraphicsGetImageFromCurrentImageContext(); 
    UIGraphicsEndImageContext(); // balance UIGraphicsBeginImageContext 
    imgV.image=[imgV.image rotate:UIImageOrientationDownMirrored]; 
} 

- (void)touchesBegan:(NSSet *)touches withEvent:(UIEvent *)event{ 
    UITouch *touch = [touches anyObject]; 
    CGPoint LasttouchPoint = [touch locationInView:self.view]; 
    int LasttouchX = LasttouchPoint.x; 
    startpoint=LasttouchX; 
} 


- (void)touchesMoved:(NSSet *)touches withEvent:(UIEvent *)event{ 

} 

- (void)touchesEnded:(NSSet *)touches withEvent:(UIEvent *)event{ 
    UITouch *touch = [touches anyObject]; 
    CGPoint LasttouchPoint = [touch locationInView:self.view]; 
    int LasttouchX = LasttouchPoint.x; 
    endpoint=LasttouchX; 
    if(startpoint>(endpoint+75)){ 
     initialPage++; 
     [self loadPage:initialPage nextOne:YES]; 
    } else if((startpoint+75)<endpoint){ 
     initialPage--; 
     [self loadPage:initialPage nextOne:NO]; 
    } 
} 


-(void)loadPage:(NSUInteger)page nextOne:(BOOL)yesOrNo{ 
    if(page<=totalPages && page>0){ 
     UIGraphicsBeginImageContext(CGSizeMake(720, 720)); 
     MyDisplayPDFPage(UIGraphicsGetCurrentContext(), page, [[[NSBundle mainBundle] pathForResource:@"MobileHIG" ofType:@"pdf"] UTF8String]); 

     CATransition *transition = [CATransition animation]; 
     transition.duration = 0.75; 
     transition.timingFunction = [CAMediaTimingFunction functionWithName:kCAMediaTimingFunctionEaseInEaseOut]; 
     transition.type=kCATransitionPush; 
     if(yesOrNo){ 
      transition.subtype=kCATransitionFromRight; 
     } else { 
      transition.subtype=kCATransitionFromLeft; 
     } 

     transition.delegate = self; 
     [imgV.layer addAnimation:transition forKey:nil]; 
     imgV.image=UIGraphicsGetImageFromCurrentImageContext(); 
     UIGraphicsEndImageContext(); // balance UIGraphicsBeginImageContext 
     imgV.image=[imgV.image rotate:UIImageOrientationDownMirrored]; 
    } 
}

But I have not managed to read even a single line from the PDF document. What is still missing?

Source

2010-03-02 Sagar R. Kothari

See this link: http://www.iphonedevsdk.com/forum/iphone-sdk-development/29770-pdf-title-keywords-label.html. It has details on reading a PDF file and extracting a string from it. The link gives details on extracting the table of contents.

If anyone needs more detail about exactly what I want to do, see this link: http://www.random-ideas.net/posts/42

I have a library that does exactly this, linked here: Extracting pdf text in Objective C

Source

2010-07-14 20:09:56 zachron

Look at how the QuartzDemo sample application does it, specifically the QuartzPDFView class in the QuartzImages.h and QuartzImages.m files. It shows an example of loading a PDF through Quartz.

Source

2010-03-03 18:06:15

Yes! I have tried that, and I have edited my question further. Please take a look. I only want the strings from the PDF, and Quartz is giving me an image.

If you want to extract some of the content of a PDF file, you may want to read the following:

Parsing PDF Content

from the Quartz 2D Programming Guide.

Basically, you use a CGPDFScanner object to parse the content, and it works as follows. You register some callbacks that Quartz 2D invokes automatically whenever it encounters certain PDF operators in the PDF stream. After this initial step, you actually start parsing the PDF stream.
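To make the callback idea concrete, here is a minimal sketch of two callbacks that collect text. It assumes the text you want is carried by the Tj and TJ text-showing operators and that CGPDFStringCopyTextString decodes it acceptably (it may not for fonts with custom encodings). The names AppendPDFString, op_Tj and op_TJ are made up for illustration, and the sketch assumes the scanner was created with a CFMutableStringRef as its info pointer.

```objc
#include <CoreGraphics/CoreGraphics.h>
#include <CoreFoundation/CoreFoundation.h>

// Hypothetical helper: append one PDF string to the accumulator that the
// scanner hands to every callback through `info`.
// Assumption: `info` is a CFMutableStringRef supplied by the caller.
static void AppendPDFString(CGPDFStringRef pdfString, void *info) {
    CFMutableStringRef text = (CFMutableStringRef)info;
    CFStringRef s = CGPDFStringCopyTextString(pdfString);
    if (s != NULL) {
        CFStringAppend(text, s);
        CFRelease(s);
    }
}

// "Tj" shows a single string: pop it from the operand stack.
static void op_Tj(CGPDFScannerRef scanner, void *info) {
    CGPDFStringRef pdfString;
    if (CGPDFScannerPopString(scanner, &pdfString))
        AppendPDFString(pdfString, info);
}

// "TJ" shows an array whose elements are strings or kerning numbers.
static void op_TJ(CGPDFScannerRef scanner, void *info) {
    CGPDFArrayRef array;
    if (!CGPDFScannerPopArray(scanner, &array))
        return;
    size_t count = CGPDFArrayGetCount(array);
    for (size_t i = 0; i < count; i++) {
        CGPDFStringRef pdfString;
        // Numeric (kerning) entries fail this lookup and are skipped.
        if (CGPDFArrayGetString(array, i, &pdfString))
            AppendPDFString(pdfString, info);
    }
}
```

The marked-content operators (MP, DP, BMC, BDC, EMC) only tag regions of the page, so callbacks on those alone will not see any of the visible text.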

Looking at your code, it seems you are not following the steps needed to parse the PDF content of the page you get through CGPDFDocumentGetPage(). First set up the callbacks with CGPDFOperatorTableCreate() and CGPDFOperatorTableSetCallback(). Then get the page, create a content stream from it with CGPDFContentStreamCreateWithPage(), create a CGPDFScanner instance with CGPDFScannerCreate(), and actually start scanning with CGPDFScannerScan().
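Those steps could be strung together roughly like this. It is only a sketch: ScanPageText and ScanDocumentText are hypothetical helper names, the two callbacks are supplied by the caller, and the choice of the Tj and TJ operators is an assumption about where the page keeps its text.

```objc
#include <CoreGraphics/CoreGraphics.h>
#include <stdbool.h>

// Hypothetical helper: scan one page, routing the "Tj" and "TJ" operators
// to caller-supplied callbacks. `info` is passed untouched to each callback.
static bool ScanPageText(CGPDFPageRef page,
                         CGPDFOperatorCallback onTj,
                         CGPDFOperatorCallback onTJ,
                         void *info) {
    if (page == NULL)
        return false;

    // 1. Register the callbacks in an operator table.
    CGPDFOperatorTableRef table = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback(table, "Tj", onTj);
    CGPDFOperatorTableSetCallback(table, "TJ", onTJ);

    // 2. Wrap the page's content in a content stream.
    CGPDFContentStreamRef stream = CGPDFContentStreamCreateWithPage(page);

    // 3. Create the scanner, then 4. run it; callbacks fire during the scan.
    CGPDFScannerRef scanner = CGPDFScannerCreate(stream, table, info);
    bool ok = CGPDFScannerScan(scanner);

    // Release everything created in this function.
    CGPDFScannerRelease(scanner);
    CGPDFContentStreamRelease(stream);
    CGPDFOperatorTableRelease(table);
    return ok;
}

// Hypothetical helper: scan every page of a document in order.
static void ScanDocumentText(CGPDFDocumentRef document,
                             CGPDFOperatorCallback onTj,
                             CGPDFOperatorCallback onTJ,
                             void *info) {
    size_t pageCount = CGPDFDocumentGetNumberOfPages(document);
    // PDF page numbers start at 1, not 0.
    for (size_t n = 1; n <= pageCount; n++) {
        ScanPageText(CGPDFDocumentGetPage(document, n), onTj, onTJ, info);
    }
}
```

The info pointer is how the scanned data gets back to you: pass something like a CFMutableStringRef there and let the callbacks append to it, then read the string once the scan returns.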

The "Parsing PDF Content" section of the guide referenced above gives you all the information you need to implement PDF parsing.

Hope this helps.

Source

2010-03-04 11:42:56

I have edited my question. See, I have already added methods for that, and I have also tried scanning each page as it is loaded. But the CGPDFDictionary keys: how can anyone know them at run time?

I followed your advice, but how can I get the scanned data? – jongbanaag
