Programming project

From info319

Latest revision as of 16:53, 15 November 2022

The project shall develop an application that uses big data technologies on social-media and/or other open data. At least part of the project shall use Spark and run in the NREC cloud. The project should be carried out in groups of three, and never more. Working individually or in pairs is possible, but not recommended.

This autumn, we specifically invite projects that use big data for the news.

More information about possible projects, deadlines, and other requirements will appear here soon.

Proposing a theme: deadline

Everyone who intends to take the course must be included in a project proposal sent by email to Andreas.Opdahl@uib.no and with all the group members on Cc. The subject line must contain the string "INFO319 Project Proposal".

Proposal deadline: Wednesday October 12th 1500

Project presentations

Final project presentations: Thursday December 8th 1015

Depending a little on the number of project groups, each presentation will be brief: 15 minutes for each group + 5 minutes for questions and comments.

You may demonstrate your project live (most convincing), or you may replay a recorded demonstration (which is good to have as a backup in any case). In addition, I expect each presentation to address/answer at least these points:

  • what have you made? - or: what is your application doing?
  • which technologies have you used (languages, APIs, other software etc.)?
  • where did you get your data from? - and/or: which datasets have you used?
  • why is it a good idea to do this using big data and big data technologies? - or: what does your system do that was not possible (or at least not easy) to do before?
  • exactly what have you done and programmed so far?
  • what are you planning to do in the final few days?
  • have you got any particular problems you need to address?

Project submission

Final project submission: December 12th.

Submit your project through Inspera as a single ZIP archive. The version of your project that you submit should be anonymous.

Since the project is graded, this is an official deadline. If you do not submit on time, you will not be allowed to take the course exam a week later.
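
As an illustration of packaging the single ZIP archive, here is a minimal Python sketch that zips a project directory and leaves out files above a size limit. It is only one possible approach, not a required tool. The function name `build_submission` and the default limit are hypothetical, and the limit assumes that ">10M" in the guidelines below means 10 MB.

```python
import zipfile
from pathlib import Path

# Assumption: "10M" in the guidelines is read here as 10 MB; adjust if needed.
MAX_BYTES = 10 * 1024 * 1024


def build_submission(project_dir, archive_path, max_bytes=MAX_BYTES):
    """Zip project_dir into archive_path, skipping files larger than max_bytes.

    Returns the relative paths of the skipped files, so that they can be
    listed in the run instructions as datasets to download separately.
    """
    project_dir = Path(project_dir).resolve()
    archive_path = Path(archive_path).resolve()
    skipped = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            # Ignore directories and the archive itself if it lives in the project.
            if not path.is_file() or path.resolve() == archive_path:
                continue
            relative = path.relative_to(project_dir)
            if path.stat().st_size > max_bytes:
                skipped.append(relative.as_posix())
                continue
            # Keep everything under one top-level directory inside the archive.
            arcname = (Path(project_dir.name) / relative).as_posix()
            archive.write(path, arcname=arcname)
    return skipped
```

Calling `build_submission("my_project", "submission.zip")` (both names are placeholders) writes the archive and returns the files that were left out because of their size.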

Provide a short video (max 5 minutes) that shows your system running, with voice comments.

Comment your code sparsely and in-line. You do not need additional documentation, but you should provide a precise description of how to run your system. For example, explain:

  • which additional packages need to be installed
  • which datasets need to be downloaded
    • do not include large datasets >10M in your ZIP file
    • but it is fine to include smaller test datasets
  • if credentials (like a Twitter token) are needed to run the code, explain where they must be added
  • which other systems must be running first (e.g., Kafka, HDFS, YARN)
  • how to start your system (in particular if it consists of several programs)
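
One way to make the credential and prerequisite points above easy to follow is a small start-up check. The Python sketch below is only an illustration under stated assumptions: the environment variable name `TWITTER_BEARER_TOKEN` and the host/port pairs are hypothetical placeholders (commonly seen defaults, not values prescribed by the course), and the helper names are made up for this example.

```python
import os
import socket

# Hypothetical values: the variable name and the host/port pairs below are
# placeholders for illustration; replace them with what your system uses.
TOKEN_ENV_VAR = "TWITTER_BEARER_TOKEN"
REQUIRED_SERVICES = {
    "Kafka": ("localhost", 9092),
    "HDFS NameNode": ("localhost", 8020),
    "YARN ResourceManager": ("localhost", 8088),
}


def load_token(env=os.environ, name=TOKEN_ENV_VAR):
    """Read the credential from the environment instead of hard-coding it."""
    token = env.get(name)
    if not token:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return token


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_services(services=REQUIRED_SERVICES):
    """Return the names of the services that do not accept connections."""
    return [
        name
        for name, (host, port) in services.items()
        if not is_reachable(host, port)
    ]
```

A start script could call `load_token()` and `missing_services()` first and stop with a clear message if the token is absent or a service is not running. The run instructions then only need to name the environment variable and the services to start.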