Material to be covered in Psyc 2023
The following is meant as
an indication of the material I expect we will cover this semester. (Sometimes we do not get through it all.) Since this material all builds upon the
material covered in Psyc 2013, some of this will be review. It will be
assumed that you have mastered that material, whether or not I review it in Psyc 2023. I have tentatively linked topics to classes; however,
we will go through this as quickly or as slowly as necessary. Changes
to the schedule will be noted in class.
Some problems with probabilistic reasoning (adapted from Stanovich):
1) "man-who" statistics (salience of individual cases)
2) insufficient use of probabilistic information (Bayes' theorem)
3) cognitive illusions
4) failure to use sample size information
5) tendency to explain chance events
6) gambler's fallacy (tendency to see independent events as dependent)
7) conjunction fallacy
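A minimal sketch of point 2 (insufficient use of probabilistic information), assuming hypothetical numbers: a 1% base rate, a 90% hit rate, and a 10% false-positive rate. None of these values come from the course material, and the function name `posterior` is illustrative only. It shows how far the posterior probability can fall below the hit rate when the base rate is low.

```python
def posterior(prior, hit_rate, false_alarm_rate):
    """P(condition | positive result) via Bayes' theorem."""
    # Total probability of a positive result (true positives + false positives).
    p_positive = hit_rate * prior + false_alarm_rate * (1 - prior)
    return hit_rate * prior / p_positive

# Hypothetical values, chosen only for illustration.
p = posterior(prior=0.01, hit_rate=0.90, false_alarm_rate=0.10)
print(round(p, 3))  # roughly 0.083, far below the 90% hit rate
```

People who ignore the base rate tend to answer something close to the hit rate instead.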
Four factors to consider in evaluating external validity:
1) Population studied and how sampled
2) Operational definitions
3) Parameter values (both independent and control variables)
4) Demand characteristics (both internal & external)
other threats to validity:
statistical validity
Power: avoidance of type II (β) error; depends upon:
1) increases as the probability of making a type I error increases (trade-off between the two)
2) increases as the magnitude of the hypothesized effect increases
3) increases as the size of the sample increases (with n=102, an r of ± .16 is significant at α = .05)
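The three influences on power listed above can be sketched numerically. This is a rough illustration only, assuming a one-tailed z-test on a single mean with the effect expressed in standard-deviation units; the function name `approx_power` and all input values are hypothetical.

```python
from statistics import NormalDist

def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a one-tailed z-test on a single mean.

    effect_size is the hypothesized effect in standard-deviation units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)        # cutoff set by the type I error rate
    shift = effect_size * n ** 0.5       # how far H1 moves the test statistic
    return 1 - z.cdf(z_crit - shift)     # P(reject H0 | H1 true) = 1 - beta

# Hypothetical baseline, then change one factor at a time.
base = approx_power(effect_size=0.3, n=30, alpha=0.05)
more_alpha = approx_power(0.3, 30, alpha=0.10)      # larger type I error rate
bigger_effect = approx_power(0.5, 30, alpha=0.05)   # larger hypothesized effect
bigger_n = approx_power(0.3, 60, alpha=0.05)        # larger sample
```

Each of the three variations yields higher power than the baseline, matching points 1 to 3.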
Simple Analytic experiments: involve control, manipulation & measurement: importance of operational definitions
"intro psyc" problem: a legitimate problem, but often overstated
- convenient
- good for a first study (at least)
- in much basic research the subject population studied is not important
- replication occurs among other undergraduate populations
- not wrong, just incomplete (external validity questionable)
Types of research methods in psychology (and their strengths and weaknesses)
- trend toward experimental methods: from case studies and/or observational or survey methods, to correlational methods, to experimentation
Categorizing experimental designs:
terminology: bivalent, multivalent, between- and within-subject designs (repeated-measures design): strengths and weaknesses
- within: increases power, statistical reliability and internal validity; use when too few subjects or measures
- concerns with order (carryover) effects (due to: learning, fatigue, habituation, sensitization, contrast or adaptation)
- counteract by:
counterbalancing (if complete, order can be an IV)
randomized treatment order (and/or block randomization)
Latin square (steps in creating)
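One simple way to build a Latin square is by cyclic rotation, sketched below. This is an assumption about construction, not necessarily the steps taught in class; the function name `cyclic_latin_square` and the condition labels are hypothetical. Each condition appears once in every ordinal position, but a cyclic square does not balance which condition immediately precedes which.

```python
def cyclic_latin_square(conditions):
    """Return a Latin square: row i is the condition list rotated left by i."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]

# Each row is the treatment order for one subject (or one group of subjects).
square = cyclic_latin_square(["A", "B", "C", "D"])
for order in square:
    print(" ".join(order))
```

With four conditions this prints the orders A B C D, B C D A, C D A B, and D A B C.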
terminology: independent, dependent, intervening, control, and confounding (or "extraneous") variables
- Multiple dependent variables often used; assesses generalizability
Inferential Statistics: used to assess the reliability (statistical significance) of the results [probability that observed differences between groups (or among groups) occurred by chance alone (i.e., H0)]
Logic of inferential statistics: comparing the variability between groups with the variability within groups (e.g., t-test and F-ratio)
- within group variance: measurement & sampling error, and natural variation
- influenced by: i) degree of control over environmental factors, ii) subject differences, iii) sample size
- between group variance: same, but also includes effects of the IV
- influenced by: i) strength of IV, ii) level of treatment, iii) sensitivity of the DV's operational definition
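The between-versus-within logic can be made concrete with a hand-computed F-ratio for a one-way between-subjects design. This is a minimal sketch; the scores are hypothetical and the function name `f_ratio` is illustrative.

```python
from statistics import mean

def f_ratio(groups):
    """One-way between-subjects F: between-group over within-group variance."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    # Between-group variability: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variability: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
F = f_ratio(groups)
print(F)
```

Here the between-group mean square is 12 and the within-group mean square is 1, so F = 12; spreading the scores within each group more widely would shrink F even though the group means stay the same.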
Errors: type I (α, alpha), type II (β, beta): how controlled
1-tailed vs 2-tailed tests (decide a priori)
Between and within subjects t-test (pooled and unpooled error calculations)
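A sketch of the pooled and unpooled error calculations for a between-subjects t-test, assuming two independent groups. The data and the function names `t_pooled` and `t_unpooled` are hypothetical, and degrees of freedom and p-values are left out.

```python
from statistics import mean, variance

def t_pooled(a, b):
    """Independent-groups t with a pooled error term (assumes equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def t_unpooled(a, b):
    """Independent-groups t with separate (unpooled) variance estimates."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

# Hypothetical scores for two groups of equal size.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
print(t_pooled(group_a, group_b), t_unpooled(group_a, group_b))
```

With equal group sizes the two calculations give the same t; they diverge when the group sizes and variances differ.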
Many different inferential statistics: each intended for specific conditions
- type of measure (nominal, ordinal, interval or ratio)
- shape of the distributions
- experimental design
Nonparametric statistics: e.g., Mann-Whitney U-test, sign test
Analysis of Variance (ANOVA)
ANOVA followed by t-tests only if the ANOVA is significant (importance of placebo controls)
ANOVA: partitions variance: i) subject variables, ii) experimental error, iii) value of the IV
- significant ANOVA indicates that some of the differences among groups are unlikely to have occurred by chance alone
Factorial Experiments
- 3 stages: 1) identify each causal factor of interest, 2) decide how many levels of each, 3) determine all possible combinations
- terminology: factor = independent variable
- can examine both the main effects (by collapsing across the other IVs) and interactions: with a 2-factor design, have 3 independent questions to answer: main effects for IV1 and IV2 and the interaction between them (gives 8 possible outcomes). If any are significant, the direction or nature of the differences must be described
- if there is no interaction, then the results are said to be additive
- these are still analytic experiments: characterized (ideally) by: random selection of subjects, random assignment of subjects to groups, and concurrent contrast and control with a measured DV.
Advantages: can address the complexities of the social sciences (can examine interactive effects among multiple factors): more ecologically valid & economical.
Factorial designs: #levels IV1 x #levels IV2 x #levels IV3 ... etc.
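A small sketch of the three questions in a 2 x 2 factorial design, assuming hypothetical factor levels and cell means. It only describes the pattern of means (main effects by collapsing, interaction as a difference of differences); it does not test significance.

```python
from itertools import product

# Hypothetical 2 x 2 design; level names and cell means are illustrative only.
levels = {"IV1": ["low", "high"], "IV2": ["absent", "present"]}
cells = list(product(*levels.values()))   # stage 3: all possible combinations

cell_means = {
    ("low", "absent"): 10, ("low", "present"): 12,
    ("high", "absent"): 14, ("high", "present"): 22,
}

def marginal_mean(factor_index, level):
    """Collapse across the other factor to get a main-effect mean."""
    values = [m for cell, m in cell_means.items() if cell[factor_index] == level]
    return sum(values) / len(values)

main_iv1 = marginal_mean(0, "high") - marginal_mean(0, "low")
main_iv2 = marginal_mean(1, "present") - marginal_mean(1, "absent")
# Interaction: does the effect of IV2 differ across the levels of IV1?
interaction = (cell_means[("high", "present")] - cell_means[("high", "absent")]) - (
    cell_means[("low", "present")] - cell_means[("low", "absent")]
)
print(len(cells), main_iv1, main_iv2, interaction)
```

With these means the effect of IV2 is 2 at the low level of IV1 and 8 at the high level, so the pattern is not additive.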
- Review of terminology: independent, dependent, intervening, control, and confounding variables
- Interpreting data from 3-factor experiments and/or experiments with three or more levels of one or more IV (e.g., in-class 3x3 experiment on impression formation: occupation x personality; Tomlinson et al. (1978) pupil size experiment)
- Fazio & Backler: software tutorial on main effects and interaction (available in library)
Developmental Research Designs: methods with age or time as a variable
Three specific experimental designs:
- longitudinal designs
- cross-sectional designs
- cohort-sequential designs (cross-sequential in text)
Cohort: a group of individuals with common experiences (e.g., born at the same time)
Habituation-Dishabituation techniques: a paradigmatic technique in infancy research
- if infants show dishabituation, they must be able to discriminate
- asymmetries in dishabituation can reveal preferences
- research on categorical perception of colours (Bornstein) and phonemes (speech sounds)
Other Quasi-Analytic Designs: Ex-post-facto designs (after the fact)
WHY?: ethical reasons or an interest in organismic variables
- prospective and retrospective designs
- find naturally occurring groups and follow them forward (prospective) or trace their histories (retrospective)
Problems: not randomly assigned: inherent confounds in the populations studied
sampling problems (often a convenience sample)
dropouts in prospective studies
detection bias (equally likely to detect in both groups?)
Partial solutions:
Matching: 1) subject for subject (preferable but more difficult) or 2) distribution by distribution
- in both cases can selectively drop individuals and bias the sample further
Measuring: so will know if potential confounds (uncontrolled variables) are confounded, and to statistically control for these variables
Retrospective studies also have additional problems in that they rely on memory, so the partial solutions are more difficult to employ successfully
- more efficient (cheaper and faster). May be necessary with very rare grouping variables of interest (e.g., rare diseases)
- even with measurement and matching, internal validity is still questionable (the additional problems of retrospective designs were well illustrated by McFarland's (1988) study of cyclical variability in moods)
DVs used in Ex-post-facto studies
- relative risk ratio (prospective studies): illustrated by breast cancer data
- relative odds ratio (approximates the relative risk): retrospective studies
Problem with both is that absolute risks are hidden; both should be reported
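A sketch of how the two measures are computed from a 2 x 2 table, and of how a relative measure can hide a small absolute risk. The counts are hypothetical (they are not the breast cancer data mentioned above), and the function name `risk_measures` is illustrative.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return absolute risks, relative risk, and odds ratio from a 2 x 2 table."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    odds_exposed = exposed_cases / (exposed_total - exposed_cases)
    odds_unexposed = unexposed_cases / (unexposed_total - unexposed_cases)
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": odds_exposed / odds_unexposed,
    }

# Hypothetical counts: 20 of 10,000 exposed vs 10 of 10,000 unexposed become cases.
result = risk_measures(20, 10_000, 10, 10_000)
for name, value in result.items():
    print(name, round(value, 4))
```

The relative risk is about 2 ("risk doubles") and the odds ratio is close to it because the outcome is rare, yet the absolute risks are only 0.2% and 0.1%.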
Causality and ex-post-facto designs: No one quasi-analytic experiment will unambiguously show a causal relationship; however, with converging evidence from many such studies (5, 10 or 100?), one can make causal statements (like "smoking causes cancer")
Subject Sampling
- types: random, stratified, proportional, systematic, cluster (multistage)
- Goal is an economic sample: big enough to ensure a valid sample (and sufficient power) and no more
- Volunteers
Time-series designs, small-n designs
- A-B studies (e.g., homicides after prize fights, JFK's assassination, TV effects, etc.)
- multiple baseline designs (can be used both within and between subjects)
- accounting for drifting baselines
- problems of internal validity (due to the confounding passage of time)
non-equivalent control group
replication within-subjects (e.g., A-B-A-B-A-B... designs)
generalizability can be indicated by having all subjects show the same pattern
Quasi-analytic experiments: Bivalent correlation designs
for correlations: 1) select population and subjects of interest; 2) measure two variables of interest; 3) calculate the extent to which the two variables are systematically related
Graph data (scatterplot): predictor (assumed causal or IV) variable on abscissa (X-axis) and criterion or DV on ordinate (Y-axis)
Pearson's product-moment correlation coefficient (for interval or ratio data) measures the direction and degree of association.
r is the mean of z-score crossproducts: r = Σ(ZxZy)/N, the extent to which deviations from the average on each measure are similar for each subject sampled
- statistical inference: for a given sample size, the larger the absolute value of r, the less likely it is to have occurred by chance; similarly, for a given value of r, the larger the sample, the less likely it was to have occurred by chance.
- r² (coefficient of determination) = estimate of the proportion of variance shared by the two variables; extent to which they covary
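The z-score definition of r can be computed directly. A minimal sketch, assuming hypothetical paired scores and using the population standard deviation so that dividing by N matches the formula above; the function name `pearson_r` is illustrative.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """r as the mean of z-score cross-products (population SD, divide by N)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    cross = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(cross) / len(cross)

# Hypothetical paired scores for five subjects.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2      # coefficient of determination
print(round(r, 3), round(r_squared, 3))
```

For these scores r is about .77 and r² is .60, so roughly 60% of the variance is shared.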
Linear regression:
- looks at the correlation in terms of predictability
r² is a measure of the variance in Y accounted for (or predicted by) X.
1-r²: coefficient of nondetermination (also called coefficient of alienation or error variance)
Linear regression finds the best-fitting line: Y' = a + bX (minimizes the sum of squared deviations between predicted values Y' and actual observed values of Y; the sum of these deviations = 0. These deviations are called residuals)
- with standard scores and 1 IV, regression coefficient (b) = correlation coefficient (r) [r=b(sx/sy)]
- as the correlation grows less strong, Y' moves less in response to a given change in X
- if r=0, best predictor of Y from X is the mean of Y, and the best predictor of X from Y is the mean of X
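A sketch of the least-squares line and its residuals, assuming hypothetical paired scores; the function name `fit_line` is illustrative. It checks the property noted above that the residuals sum to zero.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y' = a + bX."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx          # the line passes through (mean of X, mean of Y)
    return a, b

# Hypothetical paired scores for five subjects.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]   # observed minus predicted
print(round(a, 2), round(b, 2), round(sum(residuals), 10))
```

Here a = 2.2 and b = 0.6; if the slope were 0 (r = 0), every prediction would reduce to the mean of Y.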
cautions: assumes linear relations among variables; truncated ranges can reduce correlations or regressions; Pearson's r (based on means) is very sensitive to the presence of outliers; heteroscedasticity (rXY relationship may vary across levels of X); combining group data can influence the size of the correlation. So: examine scatterplots!!
Problems interpreting the results of this type of research: third variable problem and directionality (not always an issue), regression artifact (e.g., Rushton), floor and ceiling effects; look for converging evidence
Correlation versus ex-post-facto design: similar and can convert one to the other [e.g., assign dummy coding to the categorical (nominal) variable and calculate a point-biserial correlation coefficient]
Interpretation problems are not related to the statistical choice; rather, they are due to the design
Causation is not a simple concept: want to have:
1) an association between variables that recurs in different contexts (replication, convergent evidence),
2) a plausible explanation showing how the predictor variable could cause the criterion variable, and
3) no equally plausible 3rd variable that could cause the variance in the criterion variable.
While correlation doesn't imply causation, causation does imply correlation
Simpson's Paradox: when two groups are classified on some attribute (as in many ex-post-facto designs), then are separated into subcategories, the group with the higher incidence (or scores) overall can have the same or even lower incidence within every one of the subcategories, e.g., 1) salaries and economics degrees, 2) race and imposition of death penalty
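A numerical sketch of the paradox, using invented counts (not the salary or death penalty data mentioned above). Group Y has the higher rate overall, yet group X has the higher rate within each subcategory, because the two groups are distributed very differently across the subcategories.

```python
# Hypothetical counts: (cases, total) for two groups within two subcategories.
data = {
    "group X": {"sub 1": (90, 100), "sub 2": (300, 1000)},
    "group Y": {"sub 1": (800, 1000), "sub 2": (20, 100)},
}

def rate(cases, total):
    """Incidence within one subcategory."""
    return cases / total

def overall_rate(group):
    """Incidence for a group after pooling its subcategories."""
    cases = sum(c for c, _ in data[group].values())
    total = sum(t for _, t in data[group].values())
    return cases / total

for group in data:
    within = {sub: rate(*counts) for sub, counts in data[group].items()}
    print(group, within, "overall:", round(overall_rate(group), 3))
```

Group X leads .90 to .80 and .30 to .20 within the subcategories, but trails roughly .35 to .75 overall.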
Partial correlation rYX2·X1: allows you to examine the relationship between 2 variables with the effect of the third removed from both. Can be viewed as the average of the simple bivariate correlations across levels of the third, "nuisance" variable that is partialled out.
- removes the systematic relationship statistically, by removing the linear trend, then correlating the residuals.
- any number of measured variables can be partialled out as if controlled for experimentally
- partial correlation can be tested for statistical significance with n-j d.f. (where j = number of variables)
Remember: there can be other confounding variables not measured
Semipartial correlations (sometimes called part correlations) rY(X2·X1): allow you to examine the relationship between 2 variables with the effect of the third removed from one (see later handout on multiple regression)
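Both coefficients can be computed from the three simple correlations using the standard first-order formulas. A minimal sketch; the correlation values are hypothetical and the function names `partial_r` and `semipartial_r` are illustrative.

```python
def partial_r(r_yx2, r_yx1, r_x2x1):
    """Partial correlation of Y and X2 with X1 removed from both."""
    numerator = r_yx2 - r_yx1 * r_x2x1
    denominator = ((1 - r_yx1 ** 2) * (1 - r_x2x1 ** 2)) ** 0.5
    return numerator / denominator

def semipartial_r(r_yx2, r_yx1, r_x2x1):
    """Semipartial (part) correlation: X1 removed from X2 only."""
    return (r_yx2 - r_yx1 * r_x2x1) / (1 - r_x2x1 ** 2) ** 0.5

# Hypothetical values: Y and X2 correlate .50, but each correlates .70 with X1.
print(round(partial_r(0.50, 0.70, 0.70), 3))
print(round(semipartial_r(0.50, 0.70, 0.70), 3))
```

With these values the .50 correlation drops to about .02 once X1 is partialled out, the pattern expected when a third variable accounts for most of the association.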
Multiple Correlation and
Regression (see later handout)
Discrete trials designs - Psychophysics
- Psychophysics is concerned with four problems: detection, identification, discrimination, and scaling
- absolute thresholds:
method of limits (ascending & descending series)
staircase method: advantages in tracking changes in sensitivity, more efficient
method of constant stimuli
- S-shaped curves: called ogives
- operational definitions of thresholds: P(yes)=.5
- thresholds vary within and between modalities
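The P(yes)=.5 operational definition can be applied to method-of-constant-stimuli data by interpolating along the ogive. This sketch assumes simple linear interpolation between adjacent tested intensities (one of several possible approaches); the data and the function name `threshold_50` are hypothetical.

```python
def threshold_50(intensities, p_yes):
    """Linearly interpolate the stimulus intensity at which P(yes) = .5."""
    points = list(zip(intensities, p_yes))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= 0.5 <= p1 and p1 > p0:
            return x0 + (0.5 - p0) * (x1 - x0) / (p1 - p0)
    raise ValueError("P(yes) never crosses .5 in the tested range")

# Hypothetical data: proportion of "yes" responses at each stimulus intensity.
intensities = [1, 2, 3, 4, 5]
p_yes = [0.05, 0.20, 0.40, 0.80, 0.95]
threshold = threshold_50(intensities, p_yes)
print(round(threshold, 2))
```

For these data the estimated absolute threshold falls at an intensity of 3.25, between the two levels that bracket 50% detection.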
Signal Detection Theory: a mathematical, theoretical system that recognizes that observers are not merely passive receivers of stimuli, but are also engaged in the process of deciding whether they are confident enough to say they detect a signal.
- detecting the signal from the noise
- teasing apart sensory ability (i.e., detectability or sensitivity) and the decision to say so (response bias).
- catch trials: reveal 2x2 matrix (outcome matrix): four possible outcomes: hit, miss, false alarm, correct negative
- Relations among these depend upon: 1) the nature of the stimulus and sensory ability, & 2) the subject's decision process
- can vary expectations or costs/benefits of outcomes (payoff matrix) to alter hit and false alarm rates (by varying the decision process without altering sensory ability or stimulus strength). If hits increase, so will false alarms
- Isosensitivity curves, also called ROC (receiver operating characteristic) curves: more sharply bowed, more sensitive
- can also assume noise is normally distributed (with a variance of 1); so too will be the signal plus noise distribution (on average more intense, so shifted to the right)
- where distributions overlap, subjects must guess
- subjects select criterion (β): say yes if and only if signal strength exceeds this (location has no effect on sensitivity)
- distance between distributions (in z-score units) = d' (sensitivity)
- calculate d' from hit and false alarm probabilities (using tables of areas under the normal curve)
- Importance of these techniques, e.g., study on acupunctural analgesia (Clark & Yang, 1974).
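The d' calculation can be sketched with the inverse normal function in place of printed tables, assuming the equal-variance model described above. The hit and false alarm rates are hypothetical. As an assumption for illustration, the criterion is reported as a location measure c (in z units) rather than as the likelihood-ratio β used in these notes.

```python
from statistics import NormalDist

def d_prime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance SDT: sensitivity d' and criterion location c."""
    z = NormalDist().inv_cdf               # converts a probability to a z-score
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa                 # separation of the two distributions
    criterion = -(z_hit + z_fa) / 2        # 0 = unbiased; > 0 = conservative
    return d_prime, criterion

# Hypothetical rates: the same observer under lax and strict payoff conditions.
lax = d_prime_and_criterion(0.84, 0.50)
strict = d_prime_and_criterion(0.50, 0.16)
print([round(v, 2) for v in lax], [round(v, 2) for v in strict])
```

Both conditions give a d' of about 1, while the criterion moves from lax (negative c) to strict (positive c): hits and false alarms rise and fall together although sensitivity is unchanged.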
Scientific theories: types of theories, functions of theories
Evaluation on the basis of: parsimony, testability, precision
Confirming vs. disconfirming strategies (confirmation bias)