Project: Initial Plan

Due: Saturday, Sept. 23rd

General Directions

The purpose of this document is to ensure that your group chooses a substantial research project topic that is interesting and worthwhile. You will be working on the collaborative final project for a large portion of this course, and you will use this deliverable to brainstorm project ideas and plan how your team will collaborate. The document should be 1-2 pages long (not including the appendix), with standard margins (1 in.), font size (11-12 pt), and line spacing (1-1.5). Convert your final document to a PDF and upload it to Gradescope under the assignment “Initial Plan” by the due date. Be sure to include your names and NetIDs in your final document, and use the group submission feature on Gradescope to include all of your group members on a single submission.

The Initial Plan is out of 100 points: 40 for meeting basic formatting requirements, 40 for Part 1, and 20 for Part 2. The formatting requirements will be graded as follows:

  • E (Exemplary, 40pts) – Work that meets all requirements.
  • N (Not yet, 24pts) – Does not meet all requirements.
  • U (Unassessable, 8pts) – Missing at least one section.

Part 1: Brainstorming (40 points)

To brainstorm ideas for your research topic, you may use one of two options:

  1. Mind map of potential project ideas.
  2. Discussion with ChatGPT or another LLM of your choice.

 

For the mind map, you can use an online tool, Google Drawings, a whiteboard, Post-it notes, etc. Just make sure you can include it in your report. To create your mind map, use the following steps:

  1. Put a central idea or main concept in the center, such as “data science research project” or something more specific that your group finds interesting.
  2. Branch out from the main idea with topics that range from general areas of interest to previous project ideas that caught your group’s attention.
  3. Branch off of those ideas to add more specific interests or personalized ways you would change a topic or project.
  4. Include your mind map as an appendix to this submission (if it is on something like a physical whiteboard, take a picture).

 

For the discussion with an LLM, do the following:

  1. Tell the LLM that you are brainstorming data science project ideas, which of your group’s interests could be potential sources of data, and that you need to find the data yourself.
  2. Ask it what ideas it has for your project.
  3. Tell it what ideas you liked, didn’t like, why a suggestion isn’t a good one, etc.
  4. Do at least 2-3 rounds of steps 2 and 3 with the LLM.
  5. Include your chat transcript as an appendix to this submission.

 

After your brainstorm, reflect by answering the following questions:

  1. Why did you choose the method you used?
  2. What patterns do you see in what you find interesting?
  3. What research topics or questions did your group generate from this brainstorming? Which of these ideas can you see your group potentially pursuing?
  4. Do you feel like more brainstorming is needed before you find a topic?
  5. If you used:
    1. The mind map: Did you find your brainstorming narrowing or diverging as you discussed which ideas to write down?
    2. An LLM: How satisfied were you with its answers? Why?

Whether you choose to create a mind map or use an LLM, use this exercise to brainstorm project ideas that your group collectively believes are interesting, relevant, and worth your time in this course.

Grading

  • E (Exemplary, 40pts) – Appendix has a mind map that branches out at least two levels from the center OR an LLM conversation. In addition, has a reflection that answers all 5 questions.
  • S (Satisfactory, 39pts) – Appendix has a mind map that branches out at least two levels from the center OR an LLM conversation. In addition, has a reflection that mostly answers all 5 questions.
  • N (Not yet, 24pts) – A brainstorm that does not entirely answer 1 or 2 of the questions. Reflection does not entirely answer at least 1 of the questions.
  • U (Unassessable, 8pts) – Work that does not entirely answer 3 or more of the questions above for either the brainstorm or the reflection.

Part 2: Collaboration Plan (20 points)

This is a collaborative course project pursued by a team of students who bring different strengths and interests to the table. This reflects the reality that significant real-world projects in data science are almost always pursued by teams. For the collaboration to be successful, it helps to establish some guidelines that serve as a starting point. Your collaboration plan should address the following:

  1. How will you divide responsibilities? Will some students be responsible for certain portions of the project, or will you be more integrated and decide on responsibilities on a weekly basis?
  2. About how much time do you expect every group member to spend on the project each week, on average? It is okay if this number is higher toward the last couple of weeks of the semester.
  3. When and how will you meet? You should plan to meet at least once per week for at least 30 minutes to check in on one another’s progress, get help, and plan for what comes next. Identify a day of the week, a time, and the place/platform you will use to meet.
  4. What platform(s) will you use to communicate between meetings? Will you primarily use email, text, Slack, or other chat apps? If you want a more professional enterprise tool, Duke provides free access to Microsoft Teams.
  5. Where will you track who is doing which tasks and when those tasks will be done? This can be as simple as a Google Doc with a checklist or as advanced as a Trello board. What is important is that there is a clear record of who is doing what, the status of each task, and when it should be done.
  6. Where will you store data, code, writing, etc., so that all group members have easy access to shared materials?* Duke provides free access to Box and GitLab, which could serve these purposes, but you could also use external services like Google Drive or GitHub. Provide a link to the folder/repository in your proposal to demonstrate that it is created and ready.

* In addition to a common repository for data, you may find it useful to explore Google Colab or Deepnote, which allow you to collaborate on Jupyter notebooks and execute them in the cloud (like a Google Doc for Jupyter notebooks).

Grading

  • E (Exemplary, 20pts) – Comprehensive plan that answers all 6 questions and includes a link to the group’s folder/repository.
  • S (Satisfactory, 19pts) – Comprehensive plan that mostly answers all 6 questions. The link to the group’s folder/repository could be missing.
  • N (Not yet, 12pts) – A plan that does not entirely answer 1 or 2 of the questions above. Link can be missing.
  • U (Unassessable, 4pts) – A plan that does not entirely answer 3 or more of the questions above.