Cleaning RT Data
KEY:
ds = the dataset you are currently using
DV = dependent variable of interest
IV = independent variable of interest
Subject = name of the subject number column
XYXY = dummy name for a variable, matrix, or data frame into which you are moving information
Topics:
* Create a column that contains RTs for accurate trials only
* Remove trials on the basis of some criteria
* Replace trial scores that are more than 2.5 standard deviations from a subject's mean score
Create a column that contains RTs for accurate trials only
You are creating a column named "Trim" that will contain the values of a preexisting column named "RT", but only for accurate trials. Inaccurate trials are left as NA.
This command uses another preexisting column named "ACC" (1 = correct response, 0 = incorrect response) to decide which trials get copied into "Trim". You should change ACC to match the name in your dataset.
​
"[ds$ACC==1]" tells R "only do this in cases where the ACC column contains a 1".
_____________________________
ds$Trim <- NA   # start with NA in every row so the new column has the right length
ds$Trim[ds$ACC==1] <- ds$RT[ds$ACC==1]
_____________________________
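As a quick check of what this step produces, here is a minimal sketch on a made-up three-trial data frame (the values are invented for illustration). It uses ifelse(), a base R alternative that fills inaccurate trials with NA in a single step.

```r
# Toy data, invented for illustration: three trials, the second one incorrect
ds <- data.frame(RT = c(450, 620, 510), ACC = c(1, 0, 1))

# Copy RT into Trim for accurate trials; inaccurate trials become NA
ds$Trim <- ifelse(ds$ACC == 1, ds$RT, NA)

print(ds$Trim)   # 450, NA, 510
```

Functions such as mean() return NA when a column contains NA, so add na.rm = TRUE when summarizing "Trim".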
​
​
Remove trials on the basis of some criteria
You may want to get rid of trials on which impossible responses were made (in this case, RT < 100) and trials on which no response was made.
​
Step 1: Create a column that will eventually contain a "1" for any row that needs removal. "0" means the row is safe.
_____________________________
ds$drop <- 0
_____________________________
​
​
Step 2: Change "0" to "1" for any row that meets the removal criteria.
​
This command assumes there are columns named RT (reaction time) and RESP (response made), and that a missing response is stored as NA. Adjust these names in accordance with your dataset.

_____________________________
ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1
_____________________________
Notes:
* "|" means "or": a row is flagged if either condition is true. Use "&" ("and") if a row should be flagged only when both conditions are true.
* The [ ] after the column name effectively says "only mark as 1 when one of these conditions is met".
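To make the difference between "|" and "&" concrete, here is a small sketch on made-up data. It assumes missing responses are stored as NA, and the column name "dropBoth" is hypothetical, used only for the comparison.

```r
# Toy data, invented for illustration; a missing response is assumed to be NA
ds <- data.frame(RT = c(80, 450, 620, 90),
                 RESP = c("a", NA, "b", NA))

# "|" flags a row when EITHER condition is true
ds$drop <- 0
ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1
print(ds$drop)       # 1 1 0 1

# "&" flags a row only when BOTH conditions are true
ds$dropBoth <- 0
ds$dropBoth[ds$RT < 100 & is.na(ds$RESP)] <- 1
print(ds$dropBoth)   # 0 0 0 1
```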
​
​
Step 3: Count the number of cases that were tagged for removal.
​
You will need the function "count" from the "plyr" package.
_____________________________
library(plyr)
RemovalCount <- count(ds$drop)
_____________________________
​
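If you want removed items to be expressed as a percentage, type the following. It assumes at least one row was tagged, so that RemovalCount has two rows (one for 0 and one for 1).
_____________________________
100 * RemovalCount[2,2] / (RemovalCount[1,2] + RemovalCount[2,2])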
_____________________________
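If you would rather not load plyr, the same numbers can be obtained with base R. This is only a sketch on made-up data; the names "RemovalTable" and "PercentRemoved" are hypothetical.

```r
# Toy flag column, invented for illustration: 1 = remove, 0 = keep
ds <- data.frame(drop = c(0, 1, 0, 0))

# Number of kept (0) and flagged (1) rows
RemovalTable <- table(ds$drop)
print(RemovalTable)

# Because drop only holds 0s and 1s, its mean is the proportion flagged
PercentRemoved <- 100 * mean(ds$drop)
print(PercentRemoved)   # 25
```

The mean() version also works when no rows were flagged (it returns 0), a case in which the count table has only one row.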
​
​
Step 4: Make a new dataset that only contains rows where drop = 0
_____________________________
XYXY <- ds[ds$drop != 1,]
_____________________________
Notes:
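* If you get an error like "Error in `[.data.frame`(ds, ds$drop != 1) : undefined columns selected", then you probably forgot the "," before the closing bracket. Also check capitalization: R treats "drop" and "Drop" as different names.
* [ds$drop != 1,] says "keep the rows where drop is not equal to 1". Leaving the space after the comma empty keeps all columns.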
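A minimal sketch of the row filter on made-up data, with a before-and-after row count as a sanity check:

```r
# Toy data, invented for illustration: rows 1 and 4 are tagged for removal
ds <- data.frame(Subject = c(1, 1, 2, 2),
                 RT = c(80, 450, 620, 90),
                 drop = c(1, 0, 0, 1))

# Keep every column, but only the rows where drop is not 1
XYXY <- ds[ds$drop != 1, ]

print(nrow(ds))     # 4
print(nrow(XYXY))   # 2
print(XYXY$RT)      # 450 620
```

The original dataset ds is left untouched, so you can always go back and change the removal criteria.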
​
​
​
Replace trial scores that are more than 2.5 standard deviations from a subject's mean score
Step 1. Generate values
​
First you need to generate mean and standard deviation scores for each participant (here called "SubjMean" and "SubjSD").
​
This can be accomplished via the "ave" command. Note that "ds$RT" refers to your preexisting column of RT values. Adjust the name as needed.
​
Then you create columns that represent the upper and lower bounds for replacement (here called "Upper" and "Lower").
_____________________________
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN=mean)
ds$SubjSD <- ave(ds$RT, ds$Subject, FUN=sd)
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)
_____________________________
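To see what these four columns look like, here is a small worked sketch with two made-up subjects whose means and standard deviations are easy to verify by hand.

```r
# Toy data, invented for illustration: two subjects, three trials each
ds <- data.frame(Subject = c(1, 1, 1, 2, 2, 2),
                 RT = c(400, 500, 600, 200, 400, 600))

# ave() returns one value per row: the statistic for that row's subject
ds$SubjMean <- ave(ds$RT, ds$Subject, FUN = mean)   # 500 500 500 400 400 400
ds$SubjSD   <- ave(ds$RT, ds$Subject, FUN = sd)     # 100 100 100 200 200 200

ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)         # 750 750 750 900 900 900
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)         # 250 250 250 -100 -100 -100
print(ds)
```

Note that a lower bound can be negative for subjects with a large SD, in which case no RT will ever fall below it.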
​
​
Step 2. Replace Scores
​
Make a new column called "Trim" (or whatever you want) into which your preexisting column of RT scores (ds$RT) is copied.

Next, replace any value of "Trim" that is greater than the upper bound with the upper bound, and any value that is less than the lower bound with the lower bound.
_____________________________
ds$Trim <- ds$RT
ds$Trim[ds$Trim > ds$Upper] <- ds$Upper[ds$Trim > ds$Upper]
ds$Trim[ds$Trim < ds$Lower] <- ds$Lower[ds$Trim < ds$Lower]
_____________________________
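The same replacement can also be written in one line with the base R functions pmax() and pmin(), which compare two columns row by row. This is an alternative sketch on made-up bounds, not a change to the method above.

```r
# Toy data, invented for illustration: one RT below, one inside, one above the bounds
ds <- data.frame(RT = c(300, 500, 1200),
                 Lower = c(350, 350, 350),
                 Upper = c(900, 900, 900))

# pmax() raises values below Lower up to Lower;
# pmin() then lowers values above Upper down to Upper
ds$Trim <- pmin(pmax(ds$RT, ds$Lower), ds$Upper)

print(ds$Trim)   # 350 500 900
```

With this version, a missing RT (NA) simply stays NA in "Trim".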
​
​
Step 3. Get count of number of replaced trials.
​
Make a column named "RTcount" (or whatever) and set it to zero.
​
Set RTcount to 1 any time your new "Trim" column does not match your preexisting RT column.
_____________________________
ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1
library(plyr)
count(ds$RTcount)
_____________________________
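If you also want to know how many trials were replaced for each subject, or the overall percentage, base R can do it without plyr. A minimal sketch on made-up data; the names "ReplacedBySubject" and "PercentReplaced" are hypothetical.

```r
# Toy data, invented for illustration: one of subject 1's trials was replaced
ds <- data.frame(Subject = c(1, 1, 1, 2, 2, 2),
                 RT   = c(400, 500, 900, 200, 400, 600),
                 Trim = c(400, 500, 750, 200, 400, 600))

ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1

# Number of replaced trials for each subject
ReplacedBySubject <- tapply(ds$RTcount, ds$Subject, sum)
print(ReplacedBySubject)   # subject 1: 1, subject 2: 0

# Overall percentage of replaced trials (1 of 6 here)
PercentReplaced <- 100 * mean(ds$RTcount)
print(PercentReplaced)
```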