Aims and objectives
This module will:
- explain what data is
- examine how data is used
- explore ways to analyse data.
After completing this module, you will be able to:
- find, clean and use data
- evaluate and select tools to analyse and display data.
3. Sources of data
Plan your analysis first
The question you want answered, or the problem you want to solve, will determine the type of data you need and the analysis you need to do on it. You can then use existing data, or gather the data yourself.
Who, what, when, where?
Think about:
- Who or what: who or what is the subject of the data? For example, a group of people, food or animals.
- Where: is the location important? For example, local or international, or a particular suburb.
- When: what is the appropriate time period to look at?
There may be an existing dataset that you can analyse to answer your question. Otherwise, you can use methods to gather your own data.
Open data
Open data is data that is publicly available for reuse, and it can be accessed from a range of sources. High-quality data can be found on government websites and in institutional repositories:
- Research data — this guide lists a range of high quality public research data sources
- Text data — this guide lists a variety of sources for open text data
- Spatial data — this guide lists Queensland, Australian and Global spatial data sources
- UQ eSpace — the repository for UQ research publications and research datasets
- Google Dataset search — allows searching across multiple repositories. Limits you can apply include download format and usage rights.
GovHack is an annual event, where people are invited to apply their creative skills to open government data.
Do you want to use Census data from the Australian Bureau of Statistics (ABS) in your assignment or project?
TableBuilder is a free online data tool from the ABS for creating tables, graphs and maps of Census data. The video Basic tables (YouTube, 4m44s) shows how to select datasets and create basic tables in TableBuilder.
Dataset quality
Evaluate the quality of the dataset, just as you would evaluate any information you find, before you use it in your assignments or projects.
The metadata or description of the data should include information to help you evaluate the dataset.
For example, see how the Additional information section makes it easy to evaluate the quality of this dataset: Australia's threatened species, life history characteristics, and threatening processes.
Collecting data
You may need to collect your own data to answer your research question, if existing data is not available or suitable.
Storing data
If you collect the data yourself, you will need to think about where to store it.
If you are collecting data for a small project or assignment, local or online storage, such as Google Drive or OneDrive, is suitable as long as the data does not include identifiable personal information. The Working with files module gives an overview of local and online storage options and how to back up your files.
If you are conducting research as part of your Winter/Summer Research project, Honours, Masters, or Higher Degree by Research degree:
- See Manage research data to find out how to manage, store and secure your project’s data.
- Discuss with your supervisor/coordinator the suitability of using the UQ Research Data Manager to manage your research data.
Sample size
Decide how many responses or observations you need for an adequate sample size. Larger samples are more likely than smaller ones to allow you to draw accurate conclusions, provided the sample is representative of the group you are studying. Get information on sample design from the Australian Bureau of Statistics.
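As a rough illustration of how sample size relates to precision, the sketch below uses Cochran's formula for estimating a proportion, with an optional finite population correction. This formula does not come from this module; it is one common rule of thumb and assumes simple random sampling. The function name and the default values (95% confidence, 5% margin of error) are illustrative assumptions.

```python
import math


def sample_size(margin_of_error=0.05, z=1.96, proportion=0.5, population=None):
    """Estimate how many responses are needed to estimate a proportion.

    Uses Cochran's formula: n0 = z^2 * p * (1 - p) / e^2.
    z=1.96 corresponds to 95% confidence; proportion=0.5 is the most
    cautious choice when the true proportion is unknown.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite population correction for small, known populations.
        n0 = n0 / (1 + (n0 - 1) / population)
    # Round up, because you cannot collect a fraction of a response.
    return math.ceil(n0)


print(sample_size())                     # very large population
print(sample_size(population=1000))      # a group of 1,000 people
print(sample_size(margin_of_error=0.1))  # accepting a wider margin of error
```

Halving the margin of error roughly quadruples the number of responses needed, which is why it helps to decide how precise the answer needs to be before collecting data.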
Methods for collecting data
Observation
In the observational method, you watch processes, activities or behaviours as they occur. The subjects being observed may or may not be aware that they are under observation. The observation is recorded either as a description of what occurs or with a checklist that looks for a particular event.
Find out more about the observational method.
Surveys or polls
A survey is a method of collecting data on behaviour, attitudes and opinions. Plan your survey questions carefully so the information you get from participants is useful to answer your research question.
A poll is a short type of survey, often with only one multiple-choice question.
Information on planning a survey:
- Setting up surveys from the Australian Bureau of Statistics
- Types of questions to use and how to word them
- 10 tips for building effective surveys
You can conduct your surveys or polls face-to-face or you can use online tools.
Tool | Free account available | Guides |
---|---|---|
Google Forms | Yes | How to use Google Forms |
SurveyMonkey | Yes | Help Centre |
Crowdsignal | Yes | Help page |
Interviews or focus groups
An interview typically involves asking either structured or unstructured questions of a single participant. Usually, the questions will be open-ended to allow for more in-depth insight on a topic than a survey can give.
A focus group involves a group of selected people (usually 6 to 12 individuals) participating in a group interview, guided by a moderator. It is a good way to get a social context on a topic.
In both techniques, you may need to make an audio or video recording of the discussion, or have an observer record the details.
See some factors to consider when deciding whether to use focus groups or interviews.
Scraping
Scraping is a method of getting text and images from websites and social media. The practice can raise copyright problems, depending on how much material is reproduced and how it is used. Researchers may use automated methods to access publicly available information on the web when they are engaged in legitimate research or study that makes use of the data, and when they do not further publish the material. Get more information or advice on copyright.
The Finding and using media module provides information on how to comply with copyright.
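To give a sense of what scraping involves, here is a minimal sketch that pulls the links and visible text out of a page using Python's standard library `html.parser` module. The page content is a made-up sample string so the example runs offline; the class name `LinkAndTextParser` and the example URL are illustrative assumptions. To work with a real page you would first download its HTML (for example with `urllib.request.urlopen`), after checking the site's terms of use and the copyright advice above.

```python
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collect link targets and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep only non-empty pieces of text.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


# Hypothetical page content, used so the example runs without a network.
SAMPLE_HTML = """
<html><body>
  <h1>Open data sources</h1>
  <p>See the <a href="https://example.org/datasets">dataset list</a>.</p>
</body></html>
"""

parser = LinkAndTextParser()
parser.feed(SAMPLE_HTML)
parser.close()

print(parser.links)
print(parser.text_parts)
```

Dedicated tools such as Scrapy handle downloading, following links and rate limiting for you; this sketch only shows the extraction step.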
Web APIs
Web APIs can be used to request data from a site using a URL. API stands for Application Programming Interface. Usually, some programming knowledge is needed to use APIs. Web APIs for non-programmers explains how web APIs work, gives tips on using them and lists some popular free APIs.
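The idea of requesting data with a URL can be sketched as follows. The endpoint `https://api.example.org/v1/observations`, its parameters and the sample response are all hypothetical, invented for illustration; a real API documents its own URL, parameters and response format. The example builds the request URL and then reads a JSON response of the kind many web APIs return.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the URL from the API's documentation.
BASE_URL = "https://api.example.org/v1/observations"


def build_request_url(base_url, **params):
    """Append query parameters to the base URL, escaping them safely."""
    return f"{base_url}?{urlencode(params)}"


url = build_request_url(BASE_URL, species="koala", year=2020, format="json")
print(url)

# With a real API you would now fetch the URL, for example with
# urllib.request.urlopen(url), and read the response body.
# A made-up response is used here so the example runs offline.
SAMPLE_RESPONSE = (
    '{"count": 2, "results": ['
    '{"site": "A", "sightings": 4}, '
    '{"site": "B", "sightings": 7}]}'
)

data = json.loads(SAMPLE_RESPONSE)
total = sum(record["sightings"] for record in data["results"])
print(total)
```

Many APIs also require a key or limit how often you can send requests, so check the provider's documentation before you start.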
More tools for scraping
Tool | Freely available | Available on UQ Library computers | Guides |
---|---|---|---|
Python | Yes | Yes | Downloading webpages with Python |
Scrapy | Yes | No | Scrapy documentation |
Reaper (for social media) | Yes | No | Help information is available on the Reaper site. |
NCapture (a Chrome extension used with NVivo) | Yes | NVivo is available on computers in selected locations; see Additional software in Zenworks. | What is NCapture? Find out more about NVivo in the Analyse and display data section. |
Interested in Instagram data? See a comparison of Instagram scraping tools.
See more methods for gathering social media data in our Text mining & text analysis guide. Text mining is a way of extracting text data from documents, for analysis. See Text mining 101 for a quick overview.
Duration: Approximately 30 minutes
Graduate attributes
Knowledge and skills you can gain to contribute to your Graduate attributes:
Independence and creativity
Critical judgement
Ethical and social understanding
Support at UQ
Access UQ services to assist you with personal or study-related issues.