Glossary
I intend to add/revise this glossary as the semester progresses. If you have a question or would like a term added, please send me an email.
- Algorithm: A set of instructions that solves a problem in a finite amount of time.
- Bit: A binary digit, either a 0 or a 1. The most basic unit of digital data.
- Byte: A collection of 8 bits. Most data is measured in bytes. Frequently combined with the metric prefix system to produce units such as kilobytes, megabytes...
- CSV: Comma-Separated Values. This is a common file format for storing tabular data. Each line normally represents a record, and the fields in a record are separated by commas. If the data itself contains commas, an alternative delimiter is sometimes chosen or an additional encoding technique is used, such as wrapping the field in quotation marks. Many tools for manipulating data can read files in CSV format.
- Data: Data is information output by a sensing device or organ that includes both useful and irrelevant or redundant information and must be processed to be meaningful. (From Merriam-Webster)
- Data Mining: The process of searching data sets for patterns or information.
- Delimiter: A special character used to separate data elements. In a CSV file the delimiter is frequently a comma.
- Field: A field is a piece of data in a record.
- Information: Information is data that has been processed, organized, structured, or presented to make it meaningful or useful.
- Machine Learning: The use of algorithms and data to produce a system that can perform a task without using explicit instructions.
- Record: A record is a collection of data about an individual element of the population. This is usually a single row in a tabular data set.
- Structured Data: Structured data is generally considered to be data that is broken into the same regular set of fields for each record.
- Unstructured Data: Unstructured data is data that cannot easily be broken into fields.
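
To see how the CSV, Delimiter, Record, and Field entries fit together, here is a minimal sketch using Python's standard csv module. The sample data, the column names, and the read_records helper are made up for illustration and do not come from any course data set. The second record shows one common encoding technique for data that contains a comma: the field is wrapped in quotation marks.

```python
import csv
import io

# Hypothetical sample data: a header line followed by two records.
# The name in the second record contains a comma, so it is quoted.
SAMPLE = (
    "name,major,credits\n"
    "Ada,Mathematics,15\n"
    '"Hopper, Grace",Computer Science,12\n'
)


def read_records(text, delimiter=","):
    """Return each record as a dict mapping field name to field value."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


records = read_records(SAMPLE)
for record in records:
    # Each record is one row; each key/value pair is one field.
    print(record["name"], "|", record["major"], "|", record["credits"])
```

Every field comes back as text, so a value such as credits would still need to be converted to a number before doing arithmetic with it. Passing a different delimiter argument (for example a tab or a semicolon) is one way to read files that use an alternative delimiter.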