Searching for a sequence of bytes in a binary file with Java

I have a sequence of bytes that I need to search for in a set of binary files using Java.

Example: I'm searching for the byte sequence DEADBEEF (in hex) in a binary file. How would I do this in Java? Is there a built-in method, like String.contains(), for binary files?

2009-10-02 Bassam

No, there is no built-in method for this. But here is an implementation, copied directly from here (with two fixes applied to the original code):

/**
 * Knuth-Morris-Pratt Algorithm for Pattern Matching
 */
class KMPMatch {
    /**
     * Finds the first occurrence of the pattern in the text.
     */
    public int indexOf(byte[] data, byte[] pattern) {
        int[] failure = computeFailure(pattern);

        int j = 0;
        if (data.length == 0) return -1;

        for (int i = 0; i < data.length; i++) {
            while (j > 0 && pattern[j] != data[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == data[i]) { j++; }
            if (j == pattern.length) {
                return i - pattern.length + 1;
            }
        }
        return -1;
    }

    /**
     * Computes the failure function using a boot-strapping process,
     * where the pattern is matched against itself.
     */
    private int[] computeFailure(byte[] pattern) {
        int[] failure = new int[pattern.length];

        int j = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (j > 0 && pattern[j] != pattern[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == pattern[i]) {
                j++;
            }
            failure[i] = j;
        }

        return failure;
    }
}

2009-10-02 05:11:13 janko

I love StackOverflow. Thanks! – Teekin

A very minor optimization: you don't need to compute the pattern's failure function if data.length is zero, so you can move the data.length zero check to the first line of the function. – dexametason
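For readers who want to apply the answer to the DEADBEEF example from the question, here is a minimal sketch with the length checks moved ahead of the failure-table computation, as the comment above suggests. The class name KmpFileSearch, the empty-pattern behaviour (returning 0, like String.indexOf("")) and the choice to read each file fully into memory with Files.readAllBytes are assumptions of this sketch, not part of the original answer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Illustrative helper; the class and method names are hypothetical. */
class KmpFileSearch {

    /** Returns the index of the first occurrence of pattern in data, or -1. */
    static int indexOf(byte[] data, byte[] pattern) {
        // Cheap checks first, so the failure table is only built when needed.
        if (pattern.length == 0) return 0;            // assumption: mirror String.indexOf("")
        if (data.length < pattern.length) return -1;  // also covers data.length == 0

        int[] failure = computeFailure(pattern);
        int j = 0; // number of pattern bytes matched so far
        for (int i = 0; i < data.length; i++) {
            while (j > 0 && pattern[j] != data[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == data[i]) {
                j++;
            }
            if (j == pattern.length) {
                return i - pattern.length + 1;
            }
        }
        return -1;
    }

    /** Same failure function as in the answer above. */
    static int[] computeFailure(byte[] pattern) {
        int[] failure = new int[pattern.length];
        int j = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (j > 0 && pattern[j] != pattern[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == pattern[i]) {
                j++;
            }
            failure[i] = j;
        }
        return failure;
    }

    public static void main(String[] args) throws IOException {
        // The hex sequence DE AD BE EF from the question; the casts are
        // needed because Java bytes are signed.
        byte[] pattern = {(byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": " + indexOf(data, pattern));
        }
    }
}
```

Reading the whole file into a byte[] keeps the example short, but it only suits files that fit comfortably in memory (a Java array holds at most about 2 GB).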

private int bytesIndexOf(byte[] source, byte[] search, int fromIndex) {
    boolean find = false;
    int i;
    for (i = fromIndex; i < (source.length - search.length); i++) {
        if (source[i] == search[0]) {
            find = true;
            for (int j = 0; j < search.length; j++) {
                if (source[i + j] != search[j]) {
                    find = false;
                }
            }
        }
        if (find) {
            break;
        }
    }
    if (!find) {
        return -1;
    }
    return i;
}

2011-01-12 04:18:36 joseluisbz

This will not work when the match ends on the last byte of the array. –
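The problem described in this comment comes from the loop bound i < (source.length - search.length), which stops one position short of the last possible match. A minimal corrected sketch follows; the class name NaiveByteSearch, the clamping of a negative fromIndex to 0 and the early break from the inner loop are choices made for this sketch rather than part of the original answer.

```java
/** Illustrative helper; the class name is hypothetical. */
class NaiveByteSearch {

    /**
     * Returns the index of the first occurrence of search in source at or
     * after fromIndex, or -1 if there is none.
     */
    static int bytesIndexOf(byte[] source, byte[] search, int fromIndex) {
        // "<=" so that a match ending on the last byte of source is checked too.
        // If search is longer than source the bound is negative and the loop
        // never runs. An empty search matches at the start index.
        for (int i = Math.max(fromIndex, 0); i <= source.length - search.length; i++) {
            boolean found = true;
            for (int j = 0; j < search.length; j++) {
                if (source[i + j] != search[j]) {
                    found = false;
                    break; // no need to compare the remaining bytes
                }
            }
            if (found) {
                return i;
            }
        }
        return -1;
    }
}
```

This is still a brute-force scan, worst case proportional to source.length times search.length, which is usually fine for short patterns.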

2015-03-17 14:43:04

Where should the 1024-byte limit on the pattern be enforced, as suggested by the unused MAX_PATTERN_LENGTH member? – user1767316

You can find a byte sequence in a file on the order of gigabytes using bigdoc.

The library and an example are on GitHub: https://github.com/riversun/bigdoc

package org.example;

import java.io.File;
import java.util.List;

import org.riversun.bigdoc.bin.BigFileSearcher;

public class Example {

    public static void main(String[] args) throws Exception {

        byte[] searchBytes = "hello world.".getBytes("UTF-8");

        File file = new File("/var/tmp/yourBigfile.bin");

        BigFileSearcher searcher = new BigFileSearcher();

        List<Long> findList = searcher.searchBigFile(file, searchBytes);

        System.out.println("positions = " + findList);
    }
}

If you want to search in memory, take a look at this. Examples are on GitHub: https://github.com/riversun/finbin

import java.util.List;

import org.riversun.finbin.BigBinarySearcher;

public class Example {

    public static void main(String[] args) throws Exception {

        BigBinarySearcher bbs = new BigBinarySearcher();

        byte[] iamBigSrcBytes = "Hello world.It's a small world.".getBytes("utf-8");

        byte[] searchBytes = "world".getBytes("utf-8");

        List<Integer> indexList = bbs.searchBytes(iamBigSrcBytes, searchBytes);

        System.out.println("indexList=" + indexList);
    }
}

It returns all matching offsets in the byte array.

It can also handle large byte arrays :)
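If adding a dependency is not an option, the same kind of result (every offset, as a long, without loading the file into memory) can be approximated with the standard library. The sketch below is not how bigdoc or finbin work internally; it is a library-free illustration that feeds a stream through Knuth-Morris-Pratt one byte at a time, which works because KMP never steps backwards in the text. The class name StreamingByteSearch, the method name findAll and the decision to report overlapping matches and to return no matches for an empty pattern are assumptions of this sketch.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper; the class and method names are hypothetical. */
class StreamingByteSearch {

    /** Returns the offsets of all (possibly overlapping) matches in the stream. */
    static List<Long> findAll(InputStream in, byte[] pattern) throws IOException {
        List<Long> offsets = new ArrayList<>();
        if (pattern.length == 0) {
            return offsets; // assumption: an empty pattern yields no matches
        }
        int[] failure = computeFailure(pattern);
        int j = 0;     // number of pattern bytes matched so far
        long pos = 0;  // offset of the byte currently being examined
        int b;
        while ((b = in.read()) != -1) {
            byte cur = (byte) b;
            while (j > 0 && pattern[j] != cur) {
                j = failure[j - 1];
            }
            if (pattern[j] == cur) {
                j++;
            }
            if (j == pattern.length) {
                offsets.add(pos - pattern.length + 1);
                j = failure[j - 1]; // keep going so overlapping matches are found
            }
            pos++;
        }
        return offsets;
    }

    /** Standard KMP failure function: longest proper prefix that is also a suffix. */
    static int[] computeFailure(byte[] pattern) {
        int[] failure = new int[pattern.length];
        int j = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (j > 0 && pattern[j] != pattern[i]) {
                j = failure[j - 1];
            }
            if (pattern[j] == pattern[i]) {
                j++;
            }
            failure[i] = j;
        }
        return failure;
    }

    public static void main(String[] args) throws IOException {
        byte[] pattern = {(byte) 0xDE, (byte) 0xAD, (byte) 0xBE, (byte) 0xEF};
        // Buffering matters here because findAll reads one byte per call.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            System.out.println("positions = " + findAll(in, pattern));
        }
    }
}
```

A single-threaded scan like this is bounded by disk throughput; for very large files a dedicated library may be faster.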

2015-07-09 05:00:50 riversun
