How do I count the number of occurrences of a particular value in a row in R?
I have a rather difficult problem that I can't seem to solve.
I have a large data set (23277 rows, 151 columns). Each column holds values from 0:100 (inclusive) that represent probabilities assigned to events in the world.
As part of computing the score for each individual, I need to count the occurrences of each of the values in the data set.
I first tried apply, but I need to ignore NAs and take a subset, so I tried the following:
apply(ans.samp, 1, sum(ans.samp[ans==0]), na.rm=TRUE)
I got the error message: 'sum(ans.samp[ans == 0])' is not a function, character or symbol
I repeated this process with sapply, vapply, tapply and do.call, to no avail.
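For what it's worth, the error seems to come from the third argument: apply() expects a function there, whereas sum(ans.samp[ans==0]) is an expression that R evaluates before apply() ever sees it. A minimal sketch of passing a function instead, using a small made-up data frame (toy is a hypothetical stand-in for ans.samp, not the real data):

```r
# Toy stand-in for ans.samp (hypothetical values, not the real data)
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 10L, NA),
                  c = c(100L, 0L, 0L))

# Pass a function to apply(); each row arrives as the vector `row`.
# sum() over a logical vector counts the TRUEs, and na.rm drops the NAs.
zeros_apply <- apply(toy, 1, function(row) sum(row == 0, na.rm = TRUE))

# Vectorised alternative: build a logical matrix, then sum across each row
zeros_rowsums <- rowSums(toy == 0, na.rm = TRUE)
```

Both give one count per row. On a data set of the size described, the rowSums() form is likely the faster of the two, since it avoids calling an R function once per row.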
Giving up on a vectorized solution, I wrote the following for loop.
RespCount <- function (x) { for (i in (1:nrow(x)))
{ res <- vector(mode="numeric", length=nrow(x))
ans.tmp <- x[i,]
res[i] <- length(ans.tmp[ans.tmp==0])
print(res)
}
return(res)
}
However, now that I have this running, it returns only the total sum of 0s in the sample.
I would appreciate some help with this, as I am under time pressure, and I would like to be able to solve this kind of problem in R in the future.
Sample data included for reproducibility:
structure(list(X = 1:6, X100 = c(70L, NA, 80L, 0L, 40L, NA),
X10 = c(30L, NA, NA, NA, NA, NA), X1 = c(50L, NA, NA, NA,
NA, NA), X11 = c(50L, NA, NA, NA, NA, NA), X12 = c(30L, NA,
NA, NA, NA, NA), X13 = c(50L, NA, NA, NA, NA, NA), X14 = c(70L,
NA, NA, NA, NA, NA), X15 = c(60L, NA, NA, NA, NA, NA), X158 = c(30L,
NA, NA, NA, NA, NA), X159 = c(50L, NA, NA, NA, NA, NA), X160 = c(80L,
NA, NA, NA, NA, NA), X16 = c(50L, NA, NA, NA, NA, NA), X161 = c(40L,
NA, NA, NA, NA, NA), X162 = c(100L, NA, NA, NA, NA, NA),
X163 = c(50L, NA, NA, NA, NA, NA), X164 = c(0L, NA, NA, NA,
NA, NA), X165 = c(0L, NA, NA, NA, NA, NA), X166 = c(20L,
NA, NA, NA, NA, NA), X167 = c(0L, NA, NA, NA, NA, NA), X168 = c(30L,
NA, NA, NA, NA, NA), X169 = c(100L, NA, NA, NA, NA, NA),
X170 = c(30L, NA, NA, NA, NA, NA), X17 = c(40L, NA, NA, NA,
NA, NA), X171 = c(50L, NA, NA, NA, NA, NA), X172 = c(20L,
NA, NA, NA, NA, NA), X173 = c(30L, NA, NA, NA, NA, NA), X174 = c(20L,
NA, NA, NA, NA, NA), X175 = c(30L, NA, NA, NA, NA, NA), X176 = c(10L,
NA, NA, NA, NA, NA), X177 = c(70L, NA, NA, NA, NA, NA), X178 = c(40L,
NA, NA, NA, NA, NA), X179 = c(70L, NA, NA, NA, NA, NA), X180 = c(0L,
NA, NA, NA, NA, NA), X18 = c(30L, NA, NA, NA, NA, NA), X181 = c(100L,
NA, NA, NA, NA, NA), X182 = c(100L, NA, NA, NA, NA, NA),
X183 = c(20L, NA, NA, NA, NA, NA), X184 = c(80L, NA, NA,
NA, NA, NA), X185 = c(90L, NA, NA, NA, NA, NA), X186 = c(0L,
NA, NA, NA, NA, NA), X187 = c(10L, NA, NA, NA, NA, NA), X188 = c(100L,
NA, NA, NA, NA, NA), X189 = c(100L, NA, NA, NA, NA, NA),
X190 = c(0L, NA, NA, NA, NA, NA), X19 = c(100L, NA, NA, NA,
NA, NA), X191 = c(0L, NA, NA, NA, NA, NA), X192 = c(90L,
NA, NA, NA, NA, NA), X193 = c(50L, NA, NA, NA, NA, NA), X194 = c(100L,
NA, NA, NA, NA, NA), X195 = c(10L, NA, NA, NA, NA, NA), X196 = c(100L,
NA, NA, NA, NA, NA), X197 = c(20L, NA, NA, NA, NA, NA), X198 = c(40L,
NA, NA, NA, NA, NA), X199 = c(20L, NA, NA, NA, NA, NA), X200 = c(0L,
NA, NA, NA, NA, NA), X20 = c(0L, NA, NA, NA, NA, NA), X201 = c(0L,
NA, NA, NA, NA, NA), X202 = c(20L, NA, NA, NA, NA, NA), X203 = c(20L,
NA, NA, NA, NA, NA), X204 = c(80L, NA, NA, NA, NA, NA), X205 = c(0L,
NA, NA, NA, NA, NA), X206 = c(80L, NA, NA, NA, NA, NA), X207 = c(0L,
NA, NA, NA, NA, NA), X2 = c(10L, NA, NA, NA, NA, NA), X21 = c(0L,
NA, NA, NA, NA, NA), X22 = c(100L, NA, NA, NA, NA, NA), X23 = c(50L,
NA, NA, NA, NA, NA), X24 = c(50L, NA, NA, NA, NA, NA), X25 = c(70L,
NA, NA, NA, NA, NA), X26 = c(60L, NA, NA, NA, NA, NA), X27 = c(40L,
NA, NA, NA, NA, NA), X28 = c(20L, NA, NA, NA, NA, NA), X29 = c(0L,
NA, NA, NA, NA, NA), X30 = c(90L, NA, NA, NA, NA, NA), X3 = c(0L,
NA, NA, NA, NA, NA), X31 = c(50L, NA, NA, NA, NA, NA), X32 = c(50L,
NA, NA, NA, NA, NA), X33 = c(0L, NA, NA, NA, NA, NA), X34 = c(50L,
NA, NA, NA, NA, NA), X35 = c(90L, NA, NA, NA, NA, NA), X36 = c(50L,
NA, NA, NA, NA, NA), X37 = c(60L, NA, NA, NA, NA, NA), X38 = c(40L,
NA, NA, NA, NA, NA), X39 = c(50L, NA, NA, NA, NA, NA), X40 = c(0L,
NA, NA, NA, NA, NA), X4 = c(50L, NA, NA, NA, NA, NA), X41 = c(90L,
NA, NA, NA, NA, NA), X42 = c(80L, NA, NA, NA, NA, NA), X43 = c(50L,
NA, NA, NA, NA, NA), X44 = c(80L, NA, NA, NA, NA, NA), X45 = c(80L,
NA, NA, NA, NA, NA), X46 = c(0L, NA, NA, NA, NA, NA), X47 = c(80L,
NA, NA, NA, NA, NA), X48 = c(20L, NA, NA, NA, NA, NA), X49 = c(100L,
NA, NA, NA, NA, NA), X50 = c(0L, NA, NA, NA, NA, NA), X5 = c(0L,
NA, NA, NA, NA, NA), X51 = c(80L, 100L, 70L, 100L, 0L, 60L
), X52 = c(10L, 0L, 0L, 0L, 0L, 20L), X53 = c(40L, 40L, 70L,
20L, 90L, 50L), X54 = c(0L, 10L, 0L, 50L, 50L, 0L), X55 = c(20L,
80L, 90L, 80L, 30L, 0L), X56 = c(100L, 100L, 50L, 100L, 80L,
100L), X57 = c(60L, 0L, 100L, 70L, 100L, 80L), X58 = c(100L,
100L, 100L, 50L, 100L, 100L), X59 = c(80L, 50L, 80L, 0L,
30L, 50L), X60 = c(70L, 50L, 60L, 50L, 100L, 100L), X6 = c(100L,
NA, NA, NA, NA, NA), X61 = c(50L, 50L, 50L, 30L, 70L, 50L
), X62 = c(20L, 50L, 40L, 40L, 50L, 100L), X63 = c(50L, 0L,
100L, 10L, 50L, 100L), X64 = c(60L, 30L, 0L, 50L, 50L, 50L
), X65 = c(50L, 50L, 70L, 80L, 50L, 50L), X66 = c(70L, 40L,
10L, 90L, 60L, 50L), X67 = c(30L, 50L, 50L, 0L, 50L, 60L),
X68 = c(30L, 0L, 0L, 40L, 70L, 80L), X69 = c(30L, NA, 70L,
10L, 0L, 20L), X70 = c(80L, NA, 50L, 50L, 70L, 100L), X7 = c(100L,
NA, NA, NA, NA, NA), X71 = c(70L, NA, 50L, 100L, 100L, 100L
), X72 = c(60L, NA, 70L, 50L, 80L, 50L), X73 = c(80L, NA,
80L, 80L, 80L, NA), X74 = c(50L, NA, 50L, 0L, 50L, NA), X75 = c(30L,
NA, 70L, 10L, 80L, NA), X76 = c(70L, NA, 40L, 80L, 100L,
NA), X77 = c(80L, NA, 50L, 100L, 40L, NA), X78 = c(80L, NA,
0L, 0L, 0L, NA), X79 = c(80L, NA, 50L, 50L, 50L, NA), X80 = c(40L,
NA, 90L, 70L, 60L, NA), X8 = c(50L, NA, NA, NA, NA, NA),
X81 = c(70L, NA, 60L, 40L, 80L, NA), X82 = c(80L, NA, 100L,
60L, 60L, NA), X83 = c(30L, NA, 100L, 30L, 0L, NA), X84 = c(80L,
NA, 0L, 60L, 100L, NA), X85 = c(80L, NA, 50L, 40L, 30L, NA
), X86 = c(50L, NA, 90L, 50L, 50L, NA), X87 = c(80L, NA,
50L, 70L, 20L, NA), X88 = c(40L, NA, 70L, 30L, 90L, NA),
X89 = c(50L, NA, 50L, 80L, 80L, NA), X90 = c(90L, NA, 100L,
60L, 100L, NA), X91 = c(0L, NA, 0L, 0L, 0L, NA), X9 = c(100L,
NA, NA, NA, NA, NA), X92 = c(50L, NA, 70L, 90L, 80L, NA),
X93 = c(40L, NA, 50L, 50L, 50L, NA), X94 = c(40L, NA, 0L,
60L, 40L, NA), X95 = c(90L, NA, 100L, 40L, 50L, NA), X96 = c(50L,
NA, 50L, 50L, 50L, NA), X97 = c(60L, NA, 60L, 100L, 50L,
NA), X98 = c(40L, NA, 40L, 0L, 0L, NA), X99 = c(30L, NA,
0L, 50L, 70L, NA)), .Names = c("X", "X100", "X10", "X1",
"X11", "X12", "X13", "X14", "X15", "X158", "X159", "X160", "X16",
"X161", "X162", "X163", "X164", "X165", "X166", "X167", "X168",
"X169", "X170", "X17", "X171", "X172", "X173", "X174", "X175",
"X176", "X177", "X178", "X179", "X180", "X18", "X181", "X182",
"X183", "X184", "X185", "X186", "X187", "X188", "X189", "X190",
"X19", "X191", "X192", "X193", "X194", "X195", "X196", "X197",
"X198", "X199", "X200", "X20", "X201", "X202", "X203", "X204",
"X205", "X206", "X207", "X2", "X21", "X22", "X23", "X24", "X25",
"X26", "X27", "X28", "X29", "X30", "X3", "X31", "X32", "X33",
"X34", "X35", "X36", "X37", "X38", "X39", "X40", "X4", "X41",
"X42", "X43", "X44", "X45", "X46", "X47", "X48", "X49", "X50",
"X5", "X51", "X52", "X53", "X54", "X55", "X56", "X57", "X58",
"X59", "X60", "X6", "X61", "X62", "X63", "X64", "X65", "X66",
"X67", "X68", "X69", "X70", "X7", "X71", "X72", "X73", "X74",
"X75", "X76", "X77", "X78", "X79", "X80", "X8", "X81", "X82",
"X83", "X84", "X85", "X86", "X87", "X88", "X89", "X90", "X91",
"X9", "X92", "X93", "X94", "X95", "X96", "X97", "X98", "X99"), row.names = c(NA,
6L), class = "data.frame")
Any insight would be much appreciated.
From a few attempts on the small data set above, it looks like the count is computed for each row, but when I return the res object, it just gives me the final value. How can I fix this?
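One likely cause of that behaviour: res is created inside the loop, so it is reset to all zeros on every iteration and only the last row's count survives. A sketch of a corrected loop, assuming the goal is one count per row (the value argument and the toy data are hypothetical additions for illustration):

```r
# Sketch of a corrected loop; `value` is a hypothetical extra argument
RespCount <- function(x, value = 0) {
  # Allocate the result once, before the loop, so it is not reset each time
  res <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    # Flatten the one-row data frame into a plain vector
    ans.tmp <- unlist(x[i, ])
    # Count matches; na.rm = TRUE keeps NAs out of the count
    res[i] <- sum(ans.tmp == value, na.rm = TRUE)
  }
  res
}

# Toy data (hypothetical values, not the real data set)
toy <- data.frame(a = c(0L, NA, 50L),
                  b = c(0L, 10L, NA),
                  c = c(100L, 0L, 0L))
counts <- RespCount(toy)
```

Using sum(..., na.rm = TRUE) rather than length(ans.tmp[ans.tmp==0]) also matters here because, as far as I can tell, subsetting with a logical index that contains NA keeps those NA positions, so length() would count the missing values too.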
+1 for solving the actual coding problem – Vincent
You don't need an anonymous function for apply; you can use apply(mat, 1, foo, na.rm = TRUE) – mdsumner
@mdsumner Yes, you're right. It seems I mixed things up at some point. However, since I didn't pass 'na.rm = T' as a parameter (with a default value) to 'foo()', the code above won't work. I'll update my answer to reflect your good point. – chl
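To illustrate the exchange above: arguments given to apply() after the function are forwarded to that function, so foo has to accept na.rm for the shorter call to work. A sketch with a hypothetical foo (the original foo() is not shown here, so this definition and the small matrix are assumptions):

```r
# Hypothetical foo(): counts entries equal to `value`, with an na.rm argument
foo <- function(x, value = 0, na.rm = FALSE) {
  sum(x == value, na.rm = na.rm)
}

# Small made-up matrix; matrix() fills column by column
mat <- matrix(c(0, NA, 50,
                0, 10, NA,
                100, 0, 0), nrow = 3)

# na.rm = TRUE is passed through apply() to foo()
with_na_rm <- apply(mat, 1, foo, na.rm = TRUE)

# Without it, any row containing an NA yields NA
without_na_rm <- apply(mat, 1, foo)
```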