Project Prototype

Due: Saturday, November 4th

General Directions

The prototype deliverable is intended to demonstrate a proof of concept for your final project report. Large multi-week projects are challenging, so this deliverable provides additional structure to ensure you are making substantial progress and are on a path toward success, and to get your team any help it may need.

It consists of a written report detailed below, along with any accompanying data, code, or other supplementary resources that demonstrate your progress so far in the project. You can think of it as a rough draft for your final project. The report should stand on its own so that it makes sense to someone who has not read your proposal.

The report should contain at least five parts, which we define below. In terms of length, it should be 3-4 pages using standard margins (1 in.), font (11-12 pt), and line spacing (1-1.5). A typical submission is around 2-3 pages of text and 3-4 pages overall with tables and figures. You should convert your written report to a PDF and upload it to Gradescope under the assignment “Project Prototype” by the due date. Be sure to include your names and NetIDs in your final document and use the group submission feature on Gradescope. You do not need to upload your accompanying data, code, or other supplemental resources to Gradescope; instead, your report should contain instructions on how to access these resources (see parts 2 and 4 below for more details). For a demo/example prototype, please navigate to the course Box folder.

  • E (Exemplary, 30pts) – Work that meets all requirements.
  • S (Satisfactory, 29pts) – Work that meets all requirements but is over 4 pages OR is missing the NetIDs.
  • N (Not yet, 18pts) – Does not meet all requirements.
  • U (Unassessable, 6pts) – Missing at least one section.

Part 1: Introduction and Research Questions (15 points)

Your prototype report should begin by reintroducing your topic and restating your research question(s) as in your proposal. Your research question(s) should be (1) substantial, (2) feasible, and (3) relevant. Briefly justify each of these points as in the project proposal. You can start with the text from your proposal, but you should update your introduction and research questions to reflect changes in or refinements of the project vision. Specifically, point out what has changed since the proposal or if there are no changes. Your introduction should be sufficient to provide context for the rest of your report.

Grading

  • E (Exemplary, 15pts) – Comprehensive introduction with clearly labeled, updated research questions and a justification for the research questions about whether they are substantial, feasible, and relevant. Any changes are specifically mentioned or they note there are no changes.
  • S (Satisfactory, 14pts) – Comprehensive introduction with clearly labeled research questions and a justification for the research questions about whether they are substantial, feasible, and relevant. Changes and updates may not be specifically mentioned.
  • N (Not yet, 9pts) – Incomplete introduction where the research questions or justification are missing pieces, but at least some of it is present. Or the justification is clearly not reasonable.
  • U (Unassessable, 3pts) – Introduction is entirely missing the research questions or the justification, or does not demonstrate meaningful effort.

Part 2: Data Sources (15 points)

After your introduction and research questions, your prototype should discuss the data you have collected and are using to answer your research questions. Be specific: name the datasets you are using and where they were collected from / how they were prepared. Briefly justify why your data are appropriate and sufficient to address your research questions. As in the introduction, you can begin with the text from your proposal but be sure to update it to fit your evolving project.

Grading

  • E (Exemplary, 15pts) – Origins of data are properly specified, cited, and relevant to answering the research question(s). If any data wrangling, cleaning, or other data preparation was done, these processes are explained.
  • S (Satisfactory, 14pts) – Origins of data are properly specified and cited. However, it is not clearly justified why the data are relevant to the proposed research question(s). If any data wrangling, cleaning, or other data preparation was done, these processes are explained.
  • N (Not yet, 9pts) – Poorly specified data sources or the justification for using that data set or the methods to acquire the data is lacking. No discussion of preparing the dataset.
  • U (Unassessable, 3pts) – Data sources or methods to acquire data are missing or do not demonstrate meaningful effort.

Part 3: What Modules are You Using? (15 points)

Your project should use concepts from modules we have covered or will cover in this course to answer your research question(s). We will assume you will use modules 1 (Python), 2 (NumPy/Pandas), and 5 (Probability). This section should state at least 3 more modules that you will use for your project, chosen from the list below. Each module should have a short description of how you will use the knowledge in this module and a justification for that use. In addition, state which concepts from the module you will use and at what stage of your project you plan to use this module most. Potential stages include, but are not limited to: data gathering, data cleaning, data investigation, data analysis, and final report.

  • Module 3: Visualization
  • Module 4: Data Wrangling
  • Module 6: Combining Data
  • Module 7: Statistical Inference
  • Module 8: Prediction & Supervised Machine Learning
  • Module 9: Databases and SQL
  • Module 10: Deep Learning

As in Parts 1 and 2, you can begin with the text from your proposal, but be sure to update it to fit your evolving project. You should add any additional modules you will be using and update the existing modules to be more specific to the different tasks and stages of your project.

Grading

  • E (Exemplary, 15pts) – States at least 3 modules. For each module, they provide an updated (1) short description of how they will use the module, (2) justification for using this module, (3) what concepts they will likely use, and (4) what stage they expect they will use it. 
  • S (Satisfactory, 14pts) – States at least 3 modules, but there are some weaknesses somewhere, such as one module has 3 or more parts not well fleshed out or across all 3 modules one part is weak.
  • N (Not yet, 9pts) – States at least 3 modules, but 3 or more parts are entirely missing or basically non-existent out of 12 = 4 parts X 3 modules.
  • U (Unassessable, 3pts) – Does not meet the Not Yet criteria, such as having fewer than 3 modules or missing more than 3 parts across all 12 = 4 parts X 3 modules.

Example:

Here is an example of the proposal versus the prototype justification for Module 5 (Probability). Note the differences between the two. Assume the project is about creating a prediction model that classifies the data. Remember that this module is not on the list of modules that count as one of your 3, but you are welcome to include analysis using concepts from it. Note the bolding, which will help you ensure you are meeting all requirements and help your grader find them.

Proposal

Module 5 Probability: We will use this module to calculate the accuracy of a baseline version of the model we will build. We will do this by considering the proportion of the label we are trying to predict, as well as taking into account some of the independent variables. Our justification is that we need a baseline accuracy to understand how good our model is. The concepts we will mainly use are the probability axioms and maybe some of Bayes or marginalization to calculate this baseline. We plan to use this module during the data analysis and final report stage.

Prototype

Module 5 Probability: We used this module to calculate the accuracy of a baseline version of a model we will build to predict the type of a Pokemon. We did this by considering the proportion of each Pokemon type in our data set and creating a baseline model that just predicted the most common type in our data set. Our justification is that we need a baseline accuracy to understand how good our model is for predicting the type of a Pokemon based on other characteristics. The concepts we mainly used were the probability axioms and some of Bayes or marginalization to consider if there was a better baseline model we could use. We used this module during our data analysis and plan to use it in the final report stage.
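As a rough illustration of the kind of baseline described in the prototype example above, the sketch below computes the accuracy of always predicting the most common label. It is only one possible way to do this; the function name `majority_baseline_accuracy` and the sample labels are hypothetical and are not taken from any course material or real data set.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(labels)
    top_label, top_count = counts.most_common(1)[0]
    return top_label, top_count / len(labels)


# Hypothetical labels, invented for illustration only.
types = ["Water", "Fire", "Water", "Grass", "Water", "Fire"]
label, accuracy = majority_baseline_accuracy(types)
print(f"Always predicting {label!r} is correct {accuracy:.0%} of the time")
```

A trained model is only informative if its accuracy is clearly above this number, which is why a baseline like this is worth reporting alongside your results.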

Part 4: Preliminary Results and Methods (15 points)

The preliminary results section of your report should summarize the results obtained so far in the project. Where possible, results should be summarized using clearly labeled tables or figures and supplemented with a written explanation of the significance of the results with respect to the research questions outlined in Part 1. Please note that a screenshot of your dataset does not count as a table or figure and should not be included in your Prototype (i.e., don’t screenshot the dataframe itself; try to come up with some preliminary visualizations). Instead, if your primary progress is gathering and cleaning your data, provide a table with descriptive statistics about your data. Your results do not need to be final or conclusive for your entire project but should demonstrate substantial effort and progress and should provide concrete proof of concept or initial analysis with respect to your research questions.
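For teams whose progress so far is mostly data gathering and cleaning, a descriptive statistics table might be built along the following lines. This is a minimal sketch using only the Python standard library; the column names and values are hypothetical, and the helper `describe` is our own illustration rather than a required approach.

```python
import statistics


def describe(values):
    """Summarize one numeric column, counting missing entries (None) separately."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "std": statistics.stdev(present),
        "min": min(present),
        "max": max(present),
    }


# Hypothetical columns, invented for illustration only.
columns = {
    "height_m": [0.7, 1.0, None, 1.7, 0.6],
    "weight_kg": [6.9, 13.0, 100.0, 90.5, None],
}

summary = {name: describe(vals) for name, vals in columns.items()}
stats = ["count", "missing", "mean", "median", "std", "min", "max"]

# Print one row per column under a labeled header.
print(f"{'column':<10}" + "".join(f"{s:>9}" for s in stats))
for name, row in summary.items():
    print(f"{name:<10}" + "".join(f"{row[s]:>9.2f}" for s in stats))
```

If your data are already in a pandas DataFrame, `DataFrame.describe()` produces a similar summary. Either way, the table in your report should be formatted and labeled, not pasted as a screenshot.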

Your results should be specific about exactly what data were used and how the results were generated. For example, if you scraped multiple web databases, merged them, and created a visualization, then you should explain how each step was conducted in enough detail that an informed reader could reasonably be expected to reproduce your results with time and effort. Just saying, “we cleaned the data and dealt with missing values,” is not sufficient detail, for example.
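One way to gather the detail asked for above is to record how many rows survive each cleaning step, so the report can state exactly what was removed and why. The sketch below assumes the data are a list of dictionaries; the field names, records, and helper functions are hypothetical and serve only as an illustration.

```python
def drop_missing(rows, field):
    """Keep only rows where `field` has a value."""
    return [r for r in rows if r.get(field) is not None]


def drop_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    kept = []
    for r in rows:
        if r[key] not in seen:
            seen.add(r[key])
            kept.append(r)
    return kept


# Hypothetical records, invented for illustration only.
rows = [
    {"id": 1, "score": 82},
    {"id": 2, "score": None},
    {"id": 3, "score": 91},
    {"id": 3, "score": 91},
]

# Record the row count after every step so each one can be reported.
log = [("raw data", len(rows))]
rows = drop_missing(rows, "score")
log.append(("dropped rows with missing score", len(rows)))
rows = drop_duplicates(rows, "id")
log.append(("dropped duplicate ids", len(rows)))

for step, remaining in log:
    print(f"{step}: {remaining} rows")
```

A log like this translates directly into a sentence such as "we removed 1 row with a missing score and 1 duplicate record," which is far more reproducible than "we cleaned the data."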

Your report itself should include an explanation of your methods, but it should also contain instructions on how to access your full implementation (that is, your code, data, and any other supplemental resources like additional charts or tables). The simplest way to do so is to include a link to the Box folder, GitLab repo, or whatever other platform your group is using to house your data and code. Please make sure these links are accessible to users who have not been added to the resources directly, so that UTAs and teaching staff can access them if needed.

Grading

  • E (Exemplary, 15pts) – Preliminary results are thoroughly discussed using labeled tables or figures followed by written descriptions. Specific explanation of how the results were generated and from what data. Link to code/data to create charts or visualizations is provided. 
  • S (Satisfactory, 14pts) – Preliminary results are thoroughly discussed using labeled tables or figures followed by written descriptions. Explanation of how the results were generated may lack some specification or it is somewhat unclear as to what data the results are from. Link provided.
  • N (Not yet, 9pts) – Preliminary results are discussed using tables with missing labels or lacking written descriptions. It is unclear how the results were generated and from what data.
  • U (Unassessable, 3pts) – Preliminary results are missing or do not demonstrate meaningful effort.

Part 5: Reflection and Next Steps (10 points)

In this part, you should address each of the following in its own subsection (if space is limited, how you create the clear subsections is up to you):

  1. Successes/Mostly Complete – What has been successful in the project so far or what is essentially complete and ready for the final report?
  2. Challenges/Incomplete – What has been challenging in the project so far or what is incomplete in the prototype that needs to be finished for the final report?
  3. Collaboration plan reflection – How is the collaboration going? What is currently happening versus the original proposed plan? Is the group okay with what is happening? Does the group need to renegotiate what the plan should be? If yes, what is the new plan?
  4. Next Steps – What are your next steps? These should be concrete and specific actions that your group will take to address the challenges identified in order to complete a successful final project.

Grading

  • E (Exemplary, 10pts) – All four parts are present: a comprehensive reflection on successes and challenges so far, a reflection on the collaboration plan, and a specific plan of action to address any concerns and future work.
  • S (Satisfactory, 9pts) – All four parts are present and the reflection is comprehensive on successes and challenges so far, but the collaboration plan is weak and there is only a loose plan of action to address any concerns and future work.
  • N (Not yet, 6pts) – A reflection/plan that does not entirely answer 1 or 2 of the questions above.
  • U (Unassessable, 2pts) – A reflection/plan that does not entirely answer 3 of the questions above.

Checklist Before You Submit:

  1. Does your prototype satisfy all general directions?
    1. 3-4 pages in length
    2. Standard margins (1 in.)
    3. Font size is 11-12 pt
    4. Line spacing is 1-1.5
    5. Final document is a PDF
  2. Do you have an Introduction and clearly stated Research Question(s)?
    1. Do you feel as if this part meets the requirements of E (Exemplary) or S (Satisfactory)?
  3. Have you properly specified/cited one or more specific Data Sources and justified why they are relevant to the Research Question(s)?
    1. Do you feel as if this part meets the requirements of E (Exemplary) or S (Satisfactory)?
  4. Did you state at least 3 Modules and, for each one, describe how you will use it, justify that use, and name the concepts you will use and the stage of the project where you will use it?
    1. Do you feel as if this part meets the requirements of E (Exemplary) or S (Satisfactory)?
  5. Have you reported all of your Preliminary Results and Methods, including a specific explanation of how the results were generated?
    1. Do you feel as if this part meets the requirements of E (Exemplary) or S (Satisfactory)?
  6. Have you written a comprehensive reflection?
    1. Do you feel as if this part meets the requirements of E (Exemplary) or S (Satisfactory)?