Sample size is a direct function of what statisticians call the confidence and power of a test. 'Confidence' is 1-Prob(Type I error), or the probability of the test retaining the null hypothesis when it should be retained. That is, the chance of getting a true negative. 'Power' is 1-Prob(Type II error), or the probability of the test rejecting the null hypothesis when it should be rejected. That is, the chance of getting a true positive. The levels of confidence and power are conventions rather than fixed rules, but they are generally set at 95% and 80% respectively. As the sample size gets smaller, so do the confidence and power you can achieve (as long as your margin of error stays the same).
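To make the link between sample size, confidence and power concrete, here is a minimal sketch using the standard normal-approximation formula for a two-sided, two-sample comparison of means. The function name and the `effect_size` parameter (difference in means divided by the standard deviation) are assumptions made for illustration, and a real study design may call for a different formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, confidence=0.95, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample z-test.

    effect_size is the standardized difference (difference in means / sd).
    This is a normal approximation, not an exact t-test calculation.
    """
    alpha = 1.0 - confidence                           # Prob(Type I error)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value for the confidence level
    z_power = NormalDist().inv_cdf(power)              # quantile for the desired power
    n = 2.0 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


# Medium effect size with the usual 95% confidence and 80% power.
n_default = sample_size_per_group(0.5)
# Asking for more power at the same confidence requires a larger sample.
n_high_power = sample_size_per_group(0.5, power=0.90)
```

Read the other way round, the same formula shows that a smaller sample can only be bought by accepting lower confidence, lower power, or a larger detectable effect.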
It is also important that sample data be collected at random from the population. That way each unit in the population has an equal chance of being selected for the study. This reduces bias in your results. Bias can also lead to Type I and II errors, but its effect is harder to quantify if you don't know what the bias is.
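A simple random sample, where every unit has the same chance of selection, can be sketched with Python's standard library. The helper name, the example population of unit IDs and the seed are all hypothetical and only serve the illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that each unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical population of 1000 unit IDs.
population = list(range(1, 1001))
sample = simple_random_sample(population, 50, seed=42)
```

Fixing the seed is only for reproducibility in the example. In a real study you would record how the sample was drawn.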
So there is no real way of avoiding Type I and II errors entirely (unless you take a census of the whole population). But you can reduce the chance of error by randomly selecting a large enough sample.
Any kind of data can be collected.
The data collected does not have to be measurable.
Data that is collected may have been collected previously for some reason, or it might have been collected recently. Data is usually collected to show statistics or information about something specific.
Outlier: an observation that is very different from the rest of the data. How does this affect the data? Outliers can throw off your calculations, such as the mean, so conclusions about the rest of the data may be off as well, not only the outlying value.
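A small sketch of how a single outlier shifts a calculation. The numbers are made up for illustration.

```python
from statistics import mean, median

# Made-up measurements that sit close together.
data = [10, 12, 11, 13, 12]
# The same measurements with one extreme value added.
with_outlier = data + [95]

mean_before = mean(data)            # average of the original values
mean_after = mean(with_outlier)     # average once the outlier is included
median_before = median(data)
median_after = median(with_outlier)
```

Here the mean moves from 11.6 to 25.5 while the median stays at 12, which is why the median is often preferred when outliers are suspected.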
Yes, you can analyze collected data on how colors affect human behavior. Example: In the United States a jail was painted pink and they found the pink seemed to soothe the prisoners much more than if the colors were bright or dull colors such as gray.
The collected data is organized in a fashion so you can determine if the hypothesis is supported.
In continuous grouped data, values of a continuous variable are recorded in groups. The data is presented in class intervals, so the actual data values are not visible.
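Because the individual values are not available in grouped data, summary figures are usually estimated from the class intervals. A minimal sketch, assuming each class is represented by its midpoint (the class boundaries and frequencies below are invented for the example):

```python
def grouped_mean(classes):
    """Estimate the mean from (lower, upper, frequency) class intervals.

    Each class is represented by its midpoint, so the result is an estimate.
    """
    total = sum(freq for _, _, freq in classes)
    weighted = sum(((lower + upper) / 2) * freq for lower, upper, freq in classes)
    return weighted / total


# Invented class intervals: 0-10, 10-20 and 20-30 with their frequencies.
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 7)]
estimate = grouped_mean(classes)
```

With these invented figures the estimate is 16.0. The true mean of the underlying values could differ, since the midpoint only stands in for the hidden data.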
In a relational database, data is stored in the form of tables.
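As a rough illustration of data held in a table, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `survey` table, its columns and the rows are hypothetical.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per survey respondent.
conn.execute(
    "CREATE TABLE survey (respondent_id INTEGER PRIMARY KEY, age INTEGER, city TEXT)"
)
conn.executemany(
    "INSERT INTO survey (age, city) VALUES (?, ?)",
    [(34, "Leeds"), (51, "York"), (27, "Leeds")],
)

# Count respondents per city from the stored rows.
rows = conn.execute(
    "SELECT city, COUNT(*) FROM survey GROUP BY city ORDER BY city"
).fetchall()
conn.close()
```

Each row is one collected record and each column is one attribute, which is what makes queries like the count per city straightforward.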
Calculations or comparisons made using the collected data
You may be thinking of 'anomaly'.
How is data collected and used for the purpose of national statistics?
Primary data is data that you collect yourself, while secondary data is data collected by others that you reuse.