Cleaning Reaction Times | Zach Shipstead

Cleaning RT Data

KEY:

ds = dataset you are currently using.

DV = dependent variable of interest

IV = independent variable of interest

Subject = name of subject number column

XYXY = dummy name for a variable, matrix, or data frame into which you are moving information.

Topics:

* Create a column that contains RTs for accurate trials only
* Remove trials on the basis of some criteria
* Replace trial scores that are more than 2.5 standard deviations from a subject's mean score

Create a column that contains RTs for accurate trials only

You are creating a column named "Trim" that will contain the contents of a preexisting column named "RT".

This command uses another preexisting column named "ACC" (1 = correct response, 0 = incorrect response) to decide which trials get copied into "Trim". Change ACC to match the name in your dataset. Rows that are not copied are left as NA.

"[ds$ACC==1]" tells R "only do this in cases where the ACC column contains a 1".

_____________________________

ds$Trim <- NA  # start empty; inaccurate trials stay NA
ds$Trim[ds$ACC==1] <- ds$RT[ds$ACC==1]

_____________________________
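As a rough illustration, here is the same idea run on a small made-up dataset. The values are invented, and the use of which() is an assumption added here so that a missing ACC value does not cause an error; the row is simply left as NA.

```r
# Toy data; column names follow the example above, values are made up.
ds <- data.frame(Subject = c(1, 1, 2, 2),
                 RT      = c(450, 620, 380, 510),
                 ACC     = c(1, 0, 1, NA))

# Start with NA everywhere, then copy RT only where ACC is exactly 1.
# which() drops NA values in ACC, so a missing accuracy score stays NA.
ds$Trim <- NA
keep <- which(ds$ACC == 1)
ds$Trim[keep] <- ds$RT[keep]

print(ds)
```

Only the first and third rows end up with a value in "Trim".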


Remove trials on the basis of some criteria

You may want to get rid of trials on which impossible responses were made (in this case, RT < 100) and trials on which no response was made.

Step 1: Create a column that will eventually contain a "1" for any row that needs removal. "0" means the row is safe.

_____________________________

ds$drop <- 0

_____________________________

Step 2: Change "0" to "1" for any row that meets the removal criteria.

This command assumes there are columns named RT (reaction time) and RESP (response made), and that a trial with no response is stored as NA in RESP. Adjust these to match your dataset.

_____________________________

ds$drop[ds$RT < 100 | is.na(ds$RESP)] <- 1

_____________________________

Notes:

* "|" means "or": a row is marked if either condition is met. Use "&" ("and") if a row should be marked only when both conditions are met.

* [ ] after a column name effectively says "only mark as 1 when one of these conditions is met".
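A minimal sketch of Step 1 and Step 2 on made-up data. It uses is.na(), which tests each row separately (is.null() returns a single TRUE or FALSE for the whole column). Treating an empty string as "no response" is an assumption about how your data might be coded; remove that part if it does not apply.

```r
# Toy data; values are made up. NA and "" both stand for "no response" here.
ds <- data.frame(RT   = c(85, 430, 512, 95, 640),
                 RESP = c("f", "j", NA, NA, ""),
                 stringsAsFactors = FALSE)

# Step 1: every row starts out as safe.
ds$drop <- 0

# Step 2: flag rows with an impossibly fast RT or with no response.
no_resp <- is.na(ds$RESP) | ds$RESP == ""
ds$drop[ds$RT < 100 | no_resp] <- 1

print(ds)
```

Here every row except the second is flagged.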

Step 3: Count the number of cases that were tagged for removal.

You will need the function "count" from the "plyr" package.

_____________________________

library(plyr)

RemovalCount <- count(ds$drop)

_____________________________

If you want removed items to be expressed as a percentage, type the following. This assumes at least one row was tagged, so that RemovalCount has a second row (the count of 1s).

_____________________________

100 * RemovalCount[2,2] / sum(RemovalCount[,2])

_____________________________
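If you would rather not load a package for this, the same numbers can be had from base R. This is only a sketch on made-up data; the names n_dropped and pct_dropped are placeholders. It also works when no rows were tagged.

```r
# Toy drop column: 2 of 8 rows are tagged for removal.
ds <- data.frame(drop = c(0, 1, 0, 0, 1, 0, 0, 0))

# Number of tagged rows.
n_dropped <- sum(ds$drop == 1)

# mean() of a TRUE/FALSE vector is the proportion of TRUEs.
pct_dropped <- 100 * mean(ds$drop == 1)

print(n_dropped)
print(pct_dropped)
```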

Step 4: Make a new dataset that only contains rows where drop = 0

_____________________________

XYXY <- ds[ds$drop != 1,]

_____________________________

Notes:

* If you get an error "Error in `[.data.frame`(ds, ds$drop != 1) : undefined columns selected", then you probably forgot the ","

* [ds$drop != 1,] says "keep every row where drop is not equal to 1". Leaving the space after the comma empty keeps all columns.
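A quick way to check that Step 4 did what you expected is to compare row counts before and after. The sketch below uses made-up data, and rows_removed is a placeholder name.

```r
# Toy data with one row tagged for removal.
ds <- data.frame(Subject = c(1, 1, 2, 2),
                 RT      = c(85, 430, 512, 640),
                 drop    = c(1, 0, 0, 0))

# Keep rows where drop is not 1; the trailing comma keeps all columns.
XYXY <- ds[ds$drop != 1, ]

# The number of rows lost should equal the number of rows tagged.
rows_removed <- nrow(ds) - nrow(XYXY)
print(rows_removed == sum(ds$drop == 1))
```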

Replace trial scores that are more than 2.5 standard deviations from a subject's mean score

Step 1. Generate values

First you need to generate mean and standard deviation scores for each participant (here called "SubjMean" and "SubjSD").

This can be accomplished via the "ave" function, which applies a function within each subject and returns one value per row. Note that "ds$RT" refers to your preexisting column of RT values. Adjust the name as needed.

Then you create columns that represent the upper and lower bounds for replacement (here called "Upper" and "Lower").

_____________________________

ds$SubjMean <- ave(ds$RT, ds$Subject, FUN=mean)
ds$SubjSD <- ave(ds$RT, ds$Subject, FUN=sd)
ds$Upper <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower <- ds$SubjMean - (2.5 * ds$SubjSD)

_____________________________
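One thing to watch for: if a subject has any NA in the RT column, mean and sd return NA for that whole subject. A possible workaround, sketched on made-up data, is to wrap them so that NAs are skipped. The helper names subj_mean and subj_sd are placeholders introduced for this example.

```r
# Toy data; subject 2 has one missing RT.
ds <- data.frame(Subject = c(1, 1, 1, 2, 2, 2),
                 RT      = c(400, 500, 600, 300, NA, 500))

# Wrappers that ignore missing values within each subject.
subj_mean <- function(x) mean(x, na.rm = TRUE)
subj_sd   <- function(x) sd(x, na.rm = TRUE)

ds$SubjMean <- ave(ds$RT, ds$Subject, FUN = subj_mean)
ds$SubjSD   <- ave(ds$RT, ds$Subject, FUN = subj_sd)
ds$Upper    <- ds$SubjMean + (2.5 * ds$SubjSD)
ds$Lower    <- ds$SubjMean - (2.5 * ds$SubjSD)

print(ds)
```

Every row for a given subject carries that subject's mean, SD, and bounds, including the row whose RT is missing.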

Step 2. Replace Scores

Make a new column called "Trim" (or whatever you want) into which your preexisting column of RT scores (ds$RT) is copied.

Next, replace any cases where "Trim" is greater than the upper bound, or less than the lower bound.

_____________________________

ds$Trim <- ds$RT
ds$Trim[ds$Trim > ds$Upper] <- ds$Upper[ds$Trim > ds$Upper]
ds$Trim[ds$Trim < ds$Lower] <- ds$Lower[ds$Trim < ds$Lower]

_____________________________
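The same replacement can also be written in one line with the base R functions pmax and pmin, which compare two columns row by row. This is an alternative sketch, not the method above; the bounds here are made-up fixed numbers rather than values computed per subject. A missing RT stays NA instead of causing an error.

```r
# Toy data with fixed bounds; in practice Lower and Upper come from Step 1.
ds <- data.frame(RT    = c(150, 480, 900, NA),
                 Lower = c(200, 200, 200, 200),
                 Upper = c(800, 800, 800, 800))

# pmax() raises anything below Lower up to Lower;
# pmin() then lowers anything above Upper down to Upper.
ds$Trim <- pmin(pmax(ds$RT, ds$Lower), ds$Upper)

print(ds)
```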

Step 3. Get count of number of replaced trials.

Make a column named "RTcount" (or whatever) and set it to zero.

Set RTcount to 1 any time your new Trim column does not match your preexisting RT column.

_____________________________

ds$RTcount <- 0
ds$RTcount[ds$RT != ds$Trim] <- 1
library(plyr)
count(ds$RTcount)

_____________________________
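If you also want to know how many trials were replaced for each subject, base R can do that without extra packages. A minimal sketch on made-up data follows; total_replaced and by_subject are placeholder names, and the Trim values are invented rather than computed.

```r
# Toy data: Trim differs from RT on three trials.
ds <- data.frame(Subject = c(1, 1, 1, 2, 2, 2),
                 RT      = c(150, 480, 900, 300, 950, 500),
                 Trim    = c(200, 480, 800, 300, 800, 500))

# 1 where the score was replaced, 0 where it was left alone.
ds$RTcount <- as.integer(ds$RT != ds$Trim)

# Overall count, then the count within each subject.
total_replaced <- sum(ds$RTcount)
by_subject <- tapply(ds$RTcount, ds$Subject, sum)

print(total_replaced)
print(by_subject)
```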
