INTRODUCTION
A hypothesis is a preliminary, tentative explanation put forward by the
researcher of what the outcome of an investigation is expected to be. It is an
informed, educated guess that indicates the researcher's expectations regarding
certain variables. It is the most specific way in which an answer to a problem
can be stated.
MEANING
Hypothesis means a mere assumption or supposition, or a possibility to be
proved or disproved.
1. A tentative explanation for an observation, phenomenon, or scientific
problem that can be tested by further investigation.
2. Something taken to be true for the purpose of argument or investigation; an
assumption.
3. A statement that explains or makes generalizations about a set of facts or
principles, usually forming a basis for possible experiments to confirm its
viability.
DEFINITION
“A hypothesis is a tentative generalization, the validity of which remains to be
tested.”
- George A. Lundberg
WHEN IS AN HYPOTHESIS FORMULATED
An hypothesis is formulated after the problem has
been stated and the literature study has been concluded. It is formulated
when the researcher is totally aware of the theoretical and empirical
background to the problem.
THE PURPOSE AND FUNCTION OF AN HYPOTHESIS
- It gives direction to an investigation.
- It structures the next phase in the investigation and therefore furnishes
continuity to the examination of the problem.
CHARACTERISTICS OF AN HYPOTHESIS
- It must be verifiable.
- It must be formulated in simple, understandable terms.
- It should be clear and precise.
- It should be capable of being tested.
- A relational hypothesis should state the relationship between variables.
- It should be specific and limited in scope.
- It should be consistent with most known facts.
- It should be amenable to testing within a reasonable time.
- An important requirement for hypotheses is TESTABILITY.
- A condition for testability is CLEAR and UNAMBIGUOUS CONCEPTS.
OTHER CHARACTERISTICS
- A good hypothesis is based on sound reasoning.
  - Your hypothesis should be based on previous research.
  - The hypothesis should follow the most likely outcome, not the exceptional
  outcome.
- A good hypothesis provides a reasonable explanation for the predicted outcome.
  - Do not look for unrealistic explanations.
- A good hypothesis clearly states the relationship between the defined variables.
  - A clear, simply written hypothesis is easier to test.
  - Do not be vague.
- A good hypothesis defines the variables in easy-to-measure terms.
  - Who are the participants?
  - What is different or will be different in your test?
  - What is the effect?
- A good hypothesis is testable in a reasonable amount of time.
  - Do not plan a test that will take longer than your class project.
TYPES
DESCRIPTIVE HYPOTHESIS:
Descriptive hypotheses are propositions that describe the existence, size, form
or distribution of some variable.
RELATIONAL HYPOTHESIS:
It describes the relationship between two variables.
WORKING HYPOTHESIS:
The working hypothesis indicates the nature of the data and the methods of
analysis required for the study. Working hypotheses are subject to modification
as the investigation proceeds.
NULL HYPOTHESIS:
When a hypothesis is stated negatively, that is, as a statement of no effect,
no difference or no relationship, it is called a null hypothesis. A null
hypothesis should always be specific. The null hypothesis is the one which one
wishes to disprove.
ALTERNATIVE HYPOTHESIS:
The set of alternatives to the null hypothesis is referred to as the alternative
hypothesis. The alternative hypothesis is usually the one which one wishes to
prove.
STATISTICAL HYPOTHESIS:
It is a quantitative statement about a population. When the researcher derives
a hypothesis from a sample and hopes it to be true for the entire population, it
is known as a statistical hypothesis.
SIMPLE HYPOTHESIS:
It states the existence of certain empirical uniformities. Many empirical
uniformities are common in sociological research.
COMPOSITE HYPOTHESIS:
These hypotheses aim at testing the existence of logically derived relationships
between the empirical uniformities obtained.
EXPLANATORY HYPOTHESIS:
It states that the existence of one independent variable causes or leads to an
effect on a dependent variable.
PROCEDURE OF TESTING A HYPOTHESIS:
Making a formal statement:
Construct a formal statement of the null hypothesis and also of the alternative
hypothesis.
(E.g.) Null hypothesis: H0
Alternative hypothesis: Ha
Selecting a statistical technique:
There are many important parametric tests which are frequently used in
hypothesis testing. They include the Z-test, t-test, χ²-test, and F-test. The
researcher has to select the appropriate test for the research.
Selecting the significance level:
Hypotheses are tested at a pre-determined level of significance. In practice,
either the 5% level or the 1% level of significance is adopted for accepting or
rejecting a hypothesis.
Choosing the two-tailed and one-tailed tests:
The alternative hypothesis indicates whether we should use a one-tailed test or
a two-tailed test. If the alternative hypothesis is of the type “greater than”
or of the type “less than”, we use a one-tailed test. On the other hand, if the
alternative hypothesis is of the type “not equal to”, then we use a two-tailed
test.
Compute the appropriate statistics from the sample data:
A random sample has to be selected as per the sample design decided, and for the
collected data the appropriate statistic or measure has to be computed with
reference to the research question, the type of hypothesis to be tested and the
level of measurement of the data.
Compute the significance test value:
After the sample statistic is calculated, the formula for the selected
significance test is used to obtain the calculated test value.
Obtain the critical test value:
We must locate the critical value in the table concerned with the selected
probability distribution for the given level of significance and the
appropriate number of degrees of freedom. The critical value so located in the
table is commonly known as the table value.
Deriving the inference:
The calculated value is then compared with the predetermined critical value. If
the calculated value exceeds the critical value at the 5% level, then the
difference is considered significant. On the other hand, if the calculated value
is less than the critical value at the 5% level, the difference is considered
insignificant.
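The last two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it covers the Z-test alone, it replaces the printed table with the standard library's `statistics.NormalDist`, and the function names `critical_z` and `is_significant` are made up for this example.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=True):
    """Critical (table) value of the standard normal distribution."""
    # A two-tailed test splits the significance level between both tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


def is_significant(z_calculated, alpha=0.05, two_tailed=True):
    """Compare the calculated value with the critical value."""
    crit = critical_z(alpha, two_tailed)
    if two_tailed:
        return abs(z_calculated) > crit
    # Assumption: an upper-tailed test; negate z for a lower-tailed one.
    return z_calculated > crit


print(round(critical_z(0.05), 3), round(critical_z(0.01), 3))
print(is_significant(2.10, alpha=0.05), is_significant(2.10, alpha=0.01))
```

A calculated value of 2.10 is used here purely as an example: it is judged against a different critical value at the 5% level than at the 1% level, which is why the significance level has to be fixed before the data are examined.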
Hypothesis Tests
Statisticians follow a formal
process to determine whether to reject a null hypothesis, based on sample data.
This process, called hypothesis testing, consists of four
steps.
- State the hypotheses. This involves stating the null and alternative
hypotheses. The hypotheses are stated in such a way that they are mutually
exclusive. That is, if one is true, the other must be false.
- Formulate an analysis plan. The analysis plan describes how to use sample data
to evaluate the null hypothesis. The evaluation often focuses on a single test
statistic.
- Analyze sample data. Find the value of the test statistic (mean score,
proportion, t-score, z-score, etc.) described in the analysis plan.
- Interpret results. Apply the decision rule described in the analysis plan. If
the value of the test statistic is unlikely, based on the null hypothesis,
reject the null hypothesis.
Decision Errors
Two types of errors can result
from a hypothesis test.
- Type I error. A Type I error occurs when the researcher rejects a null
hypothesis when it is true. The probability of committing a Type I error is
called the significance level. This probability is also called alpha, and is
often denoted by α.
- Type II error. A Type II error occurs when the researcher fails to reject a
null hypothesis that is false. The probability of committing a Type II error is
called Beta, and is often denoted by β. The probability of not committing a
Type II error is called the Power of the test.
          | H0 true          | H0 false
Reject H0 | Type I error (α) | OK
Accept H0 | OK               | Type II error (β)
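To make α, β and power concrete, here is a hedged sketch for one simple case: an upper-tailed Z-test of H0: μ <= μ0 with a known population standard deviation. The known-σ setting, the function name `rejection_probability` and the numbers in the usage lines are assumptions chosen for illustration, not part of the text above.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rejection_probability(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability that an upper-tailed Z-test rejects H0: mu <= mu0.

    Assumes sigma is known and the sample mean is normally distributed.
    """
    se = sigma / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # When the true mean is mu_true, the z statistic is centred on this shift.
    shift = (mu_true - mu0) / se
    return 1 - STD_NORMAL.cdf(z_crit - shift)


# H0 true (mu_true equals mu0): rejecting is a Type I error.
type_1_rate = rejection_probability(100, 100, sigma=15, n=36)

# H0 false (mu_true above mu0): rejecting is correct, so this is the power.
power = rejection_probability(100, 105, sigma=15, n=36)
beta = 1 - power  # probability of a Type II error

print(round(type_1_rate, 3), round(power, 3), round(beta, 3))
```

When the null hypothesis is exactly true, the rejection probability equals the chosen significance level, which matches the definition of α above. Power and β always sum to one.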
Decision Rules
The analysis plan includes
decision rules for rejecting the null hypothesis. In practice, statisticians
describe these decision rules in two ways - with reference to a P-value or with
reference to a region of acceptance.
· P-value. The strength of evidence in support of a null hypothesis is measured
by the P-value. Suppose the test statistic is equal to S. The P-value is the
probability of observing a test statistic at least as extreme as S, assuming
the null hypothesis is true. If the P-value is less than the significance
level, we reject the null hypothesis.
· Region of acceptance. The region of acceptance is a range of values. If the
test statistic falls within the region of acceptance, the null hypothesis is
not rejected. The region of acceptance is defined so that the chance of making
a Type I error is equal to the significance level.
The set of
values outside the region of acceptance is called the region of
rejection. If the test statistic falls within the region of rejection,
the null hypothesis is rejected. In such cases, we say that the hypothesis has
been rejected at the α level of significance.
These approaches are equivalent.
Some statistics texts use the P-value approach; others use the region of
acceptance approach. In subsequent lessons, this tutorial will present examples
that illustrate each approach.
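The equivalence of the two decision rules can be checked numerically. The sketch below assumes a two-tailed Z-test, so the test statistic follows a standard normal distribution under the null hypothesis; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value_two_tailed(z):
    """Probability of a z statistic at least as extreme as the one observed."""
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def region_of_acceptance(alpha):
    """Interval of z values for which the null hypothesis is not rejected."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return -z_crit, z_crit


def reject_by_p_value(z, alpha):
    return p_value_two_tailed(z) < alpha


def reject_by_region(z, alpha):
    low, high = region_of_acceptance(alpha)
    return not (low <= z <= high)


# Both rules give the same decision for every statistic tried here.
for z in (-2.5, -1.0, 0.3, 1.9, 2.2):
    print(z, reject_by_p_value(z, 0.05), reject_by_region(z, 0.05))
```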
One-Tailed and Two-Tailed Tests
A test of a statistical
hypothesis, where the region of rejection is on only one side of the sampling
distribution, is called a one-tailed test. For example,
suppose the null hypothesis states that the mean is less than or equal to 10.
The alternative hypothesis would be that the mean is greater than 10. The
region of rejection would consist of a range of numbers located on the right
side of the sampling distribution; that is, a set of numbers greater than 10.
A test of a statistical
hypothesis, where the region of rejection is on both sides of the sampling
distribution, is called a two-tailed test. For example,
suppose the null hypothesis states that the mean is equal to 10. The
alternative hypothesis would be that the mean is less than 10 or greater than
10. The region of rejection would consist of a range of numbers located on both
sides of the sampling distribution; that is, the region of rejection would
consist partly of numbers that were less than 10 and partly of numbers that were
greater than 10.
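As a rough illustration of where those regions fall, the sketch below computes rejection cutoffs on the sample-mean scale for the hypothesized mean of 10. It assumes a normal sampling distribution and a standard error of 1.0, neither of which is given in the text, and `rejection_region` is a made-up helper name.

```python
from statistics import NormalDist


def rejection_region(mu0, se, alpha=0.05, tail="two-sided"):
    """Return (lower_cutoff, upper_cutoff) for the sample mean.

    H0 is rejected below lower_cutoff or above upper_cutoff.
    None means there is no rejection region on that side.
    """
    z = NormalDist().inv_cdf
    if tail == "greater":  # Ha: mean > mu0, region in the right tail only
        return None, mu0 + z(1 - alpha) * se
    if tail == "less":  # Ha: mean < mu0, region in the left tail only
        return mu0 - z(1 - alpha) * se, None
    if tail == "two-sided":  # Ha: mean != mu0, alpha split across both tails
        half_width = z(1 - alpha / 2) * se
        return mu0 - half_width, mu0 + half_width
    raise ValueError("tail must be 'less', 'greater' or 'two-sided'")


print(rejection_region(10, 1.0, tail="greater"))
print(rejection_region(10, 1.0, tail="two-sided"))
```

Note that the cutoffs sit some distance away from 10: a sample mean only slightly above 10 does not fall in the region of rejection.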
A General Procedure for Conducting Hypothesis Tests
All hypothesis tests are
conducted the same way. The researcher states a hypothesis to be tested,
formulates an analysis plan, analyzes sample data according to the plan, and
accepts or rejects the null hypothesis, based on results of the analysis.
- State the hypotheses. Every hypothesis test requires the analyst to state a
null hypothesis and an alternative hypothesis. The hypotheses are stated in
such a way that they are mutually exclusive. That is, if one is true, the other
must be false; and vice versa.
- Formulate an analysis plan. The analysis plan describes how to use sample data
to accept or reject the null hypothesis. It should specify the following
elements.
  - Significance level. Often, researchers choose significance levels equal to
  0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  - Test method. Typically, the test method involves a test statistic and a
  sampling distribution. Computed from sample data, the test statistic might be
  a mean score, proportion, difference between means, difference between
  proportions, z-score, t-score, chi-square, etc. Given a test statistic and its
  sampling distribution, a researcher can assess probabilities associated with
  the test statistic. If the test statistic probability is less than the
  significance level, the null hypothesis is rejected.
- Analyze sample data. Using sample data, perform the computations called for in
the analysis plan.
  - Test statistic. When the null hypothesis involves a mean or proportion, use
  either of the following equations to compute the test statistic.
  Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
  Test statistic = (Statistic - Parameter) / (Standard error of statistic)
  where Parameter is the value appearing in the null hypothesis, and Statistic
  is the point estimate of Parameter. As part of the analysis, you may need to
  compute the standard deviation or standard error of the statistic. Previously,
  we presented common formulas for the standard deviation and standard error.
  When the parameter in the null hypothesis involves categorical data, you may
  use a chi-square statistic as the test statistic. Instructions for computing a
  chi-square test statistic are presented in the lesson on the chi-square
  goodness of fit test.
  - P-value. The P-value is the probability of observing a sample statistic as
  extreme as the test statistic, assuming the null hypothesis is true.
- Interpret the results. If the sample findings are unlikely, given the null
hypothesis, the researcher rejects the null hypothesis. Typically, this involves
comparing the P-value to the significance level, and rejecting the null
hypothesis when the P-value is less than the significance level.
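The four steps can be sketched end to end for a proportion. The text does not give a standard-error formula for a proportion, so the usual large-sample formula sqrt(p0 * (1 - p0) / n) is assumed here, together with a two-tailed test and a normal sampling distribution; the function name and the data (60 successes in 100 trials) are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-tailed z-test of H0: p = p0 against Ha: p != p0.

    Assumes n is large enough for the normal approximation to hold.
    """
    p_hat = successes / n  # Statistic: the point estimate
    se = sqrt(p0 * (1 - p0) / n)  # standard error under H0 (assumed formula)
    z = (p_hat - p0) / se  # (Statistic - Parameter) / standard error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05  # significance level fixed in the analysis plan
z, p_value = one_proportion_z_test(successes=60, n=100, p0=0.5)
decision = "reject H0" if p_value < alpha else "cannot reject H0"
print(round(z, 2), round(p_value, 4), decision)
```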
Parametric test:
Parametric methods were developed on the assumption that the underlying
distribution has a known form, such as normal or exponential. Important
parametric tests used for testing significance are the t-test, F-test, z-test,
etc. With these tests, significance is assessed and conclusions are drawn on the
basis of the nature and extent of the difference between the observed values
and those expected under the assumed distribution.
Non-parametric tests:
Non-parametric methods are distribution-free methods, which make no assumption
about the underlying distribution. Hence, they can be used regardless of the
shape of the underlying distribution. They are suitable for small samples and
can be applied even to nominal-scale and ordinal-scale data.
Important non-parametric tests used for testing significance are the median
test, Wilcoxon matched-pairs test, chi-square test, Mann-Whitney U test,
Kruskal-Wallis test, etc.
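The text names these tests without describing them. As one hedged illustration of a distribution-free method, the sketch below computes the Mann-Whitney U statistic from its usual pairwise definition and an exact two-sided P-value by enumerating every way of splitting the pooled data. The enumeration is only practical for very small samples, and the function names and data are made up for the example.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """Return (U_x, U_y): pairwise wins of each group, ties counting one half."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


def exact_two_sided_p(x, y):
    """Share of all group assignments whose smaller U is as small as observed."""
    pooled = list(x) + list(y)
    observed = min(mann_whitney_u(x, y))
    positions = range(len(pooled))
    hits = total = 0
    for chosen in combinations(positions, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in positions if i not in chosen_set]
        total += 1
        if min(mann_whitney_u(group_x, group_y)) <= observed:
            hits += 1
    return hits / total


x, y = [1, 2, 3], [4, 5, 6]
print(mann_whitney_u(x, y), exact_two_sided_p(x, y))
```

Only the ordering of the observations enters the calculation, which is why no assumption about the shape of the underlying distribution is needed.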
Hypothesis Test of the Mean
This lesson explains how to conduct a hypothesis test of a mean, when the
following conditions are met:
- The sampling method is simple random sampling.
- The sample is drawn from a normal or near-normal population.
Generally, the sampling distribution will be approximately normally distributed
if any of the following conditions apply.
- The population distribution is normal.
- The sampling distribution is symmetric, unimodal, without outliers, and the
sample size is 15 or less.
- The sampling distribution is moderately skewed, unimodal, without outliers,
and the sample size is between 16 and 40.
- The sample size is greater than 40, without outliers.
This approach consists of four
steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze
sample data, and (4) interpret results.
State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an
alternative hypothesis. The hypotheses are stated in such a way that they are
mutually exclusive. That is, if one is true, the other must be false; and vice
versa.
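The table below shows three sets of hypotheses about a population mean μ, where
M is the hypothesized value.
Set | Null hypothesis | Alternative hypothesis | Number of tails
1   | μ = M           | μ ≠ M                  | 2
2   | μ >= M          | μ < M                  | 1
3   | μ <= M          | μ > M                  | 1
The first set of hypotheses (Set 1) is an example of a two-tailed test, since an
extreme value on either side of the sampling distribution would cause a
researcher to reject the null hypothesis. The other two sets of hypotheses
(Sets 2 and 3) are one-tailed tests, since an extreme value on only one side of
the sampling distribution would cause a researcher to reject the null
hypothesis.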
Formulate an Analysis Plan
The analysis plan describes how to use sample data to accept or
reject the null hypothesis. It should specify the following elements.
- Significance level. Often, researchers choose significance levels equal to
0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
- Test method. Use the one-sample t-test to determine whether the hypothesized
mean differs significantly from the observed sample mean.
Analyze Sample Data
Using sample data, conduct a one-sample t-test. This involves
finding the standard error, degrees of freedom, test statistic, and the P-value
associated with the test statistic.
- Standard error. Compute the standard error (SE) of the sampling distribution.
SE = s * sqrt{ ( 1/n ) * ( 1 - n/N ) * [ N / ( N - 1 ) ] }
where s is the standard deviation of the sample, N is the population size, and
n is the sample size. When the population size is much larger (at least 10
times larger) than the sample size, the standard error can be approximated by:
SE = s / sqrt( n )
- Degrees of freedom. The degrees of freedom (DF) is equal to the sample size
(n) minus one. Thus, DF = n - 1.
- Test statistic. The test statistic is a t-score (t) defined by the following
equation.
t = (x - μ) / SE
where x is the sample mean, μ is the hypothesized population mean in the null
hypothesis, and SE is the standard error.
- P-value. The P-value is the probability of observing a sample statistic as
extreme as the test statistic. Since the test statistic is a t-score, use the
t Distribution Calculator to assess the probability associated with the
t-score, given the degrees of freedom computed above. (See sample problems at
the end of this lesson for examples of how this is done.)
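The first three quantities above translate directly into code. The sketch below is an illustration with made-up function names; it stops at the t-score because the P-value needs the t distribution, which the text obtains from a calculator. The sample values (mean 108, standard deviation 10, n = 20, N = 300) are used only to compare the two standard-error formulas.

```python
from math import sqrt


def standard_error(s, n, population_size=None):
    """Standard error of the sample mean.

    With population_size given, the finite-population formula is used;
    otherwise the approximation s / sqrt(n) applies.
    """
    if population_size is None:
        return s / sqrt(n)
    big_n = population_size
    return s * sqrt((1 / n) * (1 - n / big_n) * (big_n / (big_n - 1)))


def t_score(sample_mean, hypothesized_mean, s, n, population_size=None):
    """Return (t, degrees_of_freedom, standard_error)."""
    se = standard_error(s, n, population_size)
    df = n - 1
    t = (sample_mean - hypothesized_mean) / se
    return t, df, se


# Approximate versus finite-population standard error for the same sample.
print(round(standard_error(10, 20), 3))
print(round(standard_error(10, 20, population_size=300), 3))
print(t_score(108, 110, s=10, n=20))
```

The two standard errors are close when the population is at least ten or so times the sample size, which is the condition the text gives for using the simpler formula.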
Interpret Results
If the sample findings are unlikely, given the null hypothesis, the
researcher rejects the null hypothesis. Typically, this involves comparing the
P-value to the significance
level, and rejecting the null hypothesis when the P-value is less than the
significance level.
Problem 1: Two-Tailed Test
An inventor has developed a new, energy-efficient lawn mower engine.
He claims that the engine will run continuously for 5 hours (300 minutes) on a
single gallon of regular gasoline. Suppose a simple random sample of 50 engines
is tested. The engines run for an average of 295 minutes, with a standard
deviation of 20 minutes. Test the null hypothesis that the mean run time is 300
minutes against the alternative hypothesis that the mean run time is not 300
minutes. Use a 0.05 level of significance. (Assume that run times for the
population of engines are normally distributed.)
Solution: The solution to this problem takes four steps: (1) state the
hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4)
interpret results. We work through those steps below:
· State the hypotheses. The first step is to state the null hypothesis and an
alternative hypothesis.
Null hypothesis: μ = 300
Alternative hypothesis: μ ≠ 300
Note that these hypotheses constitute a two-tailed test. The null hypothesis
will be rejected if the sample mean is too big or if it is too small.
· Formulate an analysis plan. For this analysis, the significance level is 0.05.
The test method is a one-sample t-test.
· Analyze sample data. Using sample data, we compute the standard error (SE),
degrees of freedom (DF), and the t-score test statistic (t).
SE = s / sqrt(n) = 20 / sqrt(50) = 20/7.07 = 2.83
DF = n - 1 = 50 - 1 = 49
t = (x - μ) / SE = (295 - 300)/2.83 = -1.77
where s is the standard deviation of the sample, x is the sample mean, μ is the
hypothesized population mean, and n is the sample size.
Since we have a two-tailed test, the P-value is the probability that the
t-score having 49 degrees of freedom is less than -1.77 or greater than 1.77.
We use the t Distribution Calculator to find P(t < -1.77) = 0.04, and
P(t > 1.77) = 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.
· Interpret results. Since the P-value (0.08) is greater than the significance
level (0.05), we cannot reject the null hypothesis.
Problem 2: One-Tailed Test
Bon Air Elementary School has 300 students. The principal of the school thinks that the average IQ of students at Bon Air is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the sampled students, the average IQ is 108 with a standard deviation of 10. Based on these results, should the principal accept or reject her original hypothesis? Assume a significance level of 0.01.
Solution: The solution to this problem takes four steps: (1) state the
hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4)
interpret results. We work through those steps below:
· State the hypotheses. The first step is to state the null hypothesis and an
alternative hypothesis.
Null hypothesis: μ >= 110
Alternative hypothesis: μ < 110
Note that these hypotheses constitute a one-tailed test. The null hypothesis
will be rejected if the sample mean is too small.
· Formulate an analysis plan. For this analysis, the significance level is 0.01.
The test method is a one-sample t-test.
· Analyze sample data. Using sample data, we compute the standard error (SE),
degrees of freedom (DF), and the t-score test statistic (t).
SE = s / sqrt(n) = 10 / sqrt(20) = 10/4.472 = 2.236
DF = n - 1 = 20 - 1 = 19
t = (x - μ) / SE = (108 - 110)/2.236 = -0.894
where s is the standard deviation of the sample, x is the sample mean, μ is the
hypothesized population mean, and n is the sample size.
Since we have a one-tailed test, the P-value is the probability that the
t-score having 19 degrees of freedom is less than -0.894.
We use the t Distribution Calculator to find P(t < -0.894) = 0.19. Thus, the
P-value is 0.19.
· Interpret results. Since the P-value (0.19) is greater than the significance
level (0.01), we cannot reject the null hypothesis.
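Both worked problems can be re-checked without a t distribution calculator. The sketch below is one possible approach, not the method used in the text: it integrates the Student t density numerically with Simpson's rule, using only the standard library. The function names are made up, and because `math.gamma` overflows for large arguments the sketch is only suitable for moderate degrees of freedom (roughly below 340).

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t): 0.5 plus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps  # negative when t is negative, so the integral is signed
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def one_sample_t_test(sample_mean, mu0, s, n, tail="two-sided"):
    """Return (t, df, p_value) using the approximation SE = s / sqrt(n)."""
    se = s / sqrt(n)
    df = n - 1
    t = (sample_mean - mu0) / se
    if tail == "less":
        p_value = t_cdf(t, df)
    elif tail == "greater":
        p_value = 1 - t_cdf(t, df)
    elif tail == "two-sided":
        p_value = 2 * t_cdf(-abs(t), df)
    else:
        raise ValueError("tail must be 'less', 'greater' or 'two-sided'")
    return t, df, p_value


# Problem 1: H0: mu = 300, two-tailed, alpha = 0.05
print(one_sample_t_test(295, 300, s=20, n=50, tail="two-sided"))
# Problem 2: H0: mu >= 110, one-tailed, alpha = 0.01
print(one_sample_t_test(108, 110, s=10, n=20, tail="less"))
```

The P-values printed should agree with the 0.08 and 0.19 quoted above once rounded, so in both problems the null hypothesis cannot be rejected at the stated significance level.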
Conclusion
An
hypothesis is a specific statement of prediction. It describes in concrete
terms what you expect will happen in your study. Not all studies have
hypotheses. Sometimes a study is designed to be exploratory. There is no formal
hypothesis, and perhaps the purpose of the study is to explore some area more
thoroughly in order to develop some specific hypothesis or prediction that can
be tested in future research. A single study may have one or many hypotheses.