Sunday, March 29, 2015

R - Find Non-Matching Rows Between 2 Text Files


Let's say we have generated 2 csv files from different runs of the process described in the mentioned link (creating a list of FD and FC from each GDB in a csv file).

Comparing the 2 files one direction at a time gives 2 csv files, each holding the rows (GDB, FD, FC) that are present in one file but missing from the other.

Here is the R code:

setwd("J:/Thumb drive/Technical/GIS Exercise/Chapters/Chapter 14 - R by example in Statistics & Geospatial Analysis/R/reneedyourhelpmann/Incomplete/Non Matching Rows/")

# Read data and concatenate fields
f1 <- read.csv("GDB_output_More.txt", header=TRUE, na.strings = " ")
f1.cat <- paste( f1[,1], f1[,2], f1[,3], sep="-" )
f2 <- read.csv("GDB_output_Less.txt", header=TRUE, na.strings = " ")
f2.cat <- paste( f2[,1], f2[,2], f2[,3], sep="-" )

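# Create vector of "non-matching" values (rows of f1 not found in f2).
# Logical indexing is used because x[-which(...)] returns an empty vector when nothing matches.
( x <- f1.cat[!(f1.cat %in% f2.cat)] )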

# Create dataframe of results matching columns of source data.
# Subsetting f1 directly avoids splitting on "-", which breaks when a field
# itself contains a hyphen, and also works when x has zero or one element.
missing <- f1[f1.cat %in% x, , drop = FALSE]
names(missing) <- names(f1)
missing

# Write csv of results
write.csv(missing, "MissingData.csv", row.names=FALSE, quote=FALSE)
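
The script above compares in one direction only (rows of the first file that are not in the second). A minimal sketch of running the comparison in both directions is shown here, using small made-up data frames in place of the two text files. The column names GDB, FD, FC, the sample values, and the helper name rows_missing_from are assumptions for illustration, not part of the original script.

```r
# Hypothetical stand-ins for the two input files (sample values are made up)
f1 <- data.frame(GDB = c("a.gdb", "a.gdb", "b.gdb"),
                 FD  = c("fd1", "fd1", "fd2"),
                 FC  = c("roads", "parcels", "rivers"),
                 stringsAsFactors = FALSE)
f2 <- data.frame(GDB = c("a.gdb", "c.gdb"),
                 FD  = c("fd1", "fd3"),
                 FC  = c("roads", "lakes"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: return the rows of `a` whose concatenated key
# does not appear in `b`. A tab is used as separator on the assumption
# that no field contains one.
rows_missing_from <- function(a, b, sep = "\t") {
  key_a <- do.call(paste, c(a, sep = sep))
  key_b <- do.call(paste, c(b, sep = sep))
  a[!(key_a %in% key_b), , drop = FALSE]
}

only_in_f1 <- rows_missing_from(f1, f2)  # rows present in f1 but not in f2
only_in_f2 <- rows_missing_from(f2, f1)  # rows present in f2 but not in f1

print(only_in_f1)
print(only_in_f2)
```

Each result can then be passed to write.csv() in the same way as above, giving one csv per direction.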
