Programming project

Latest revision as of 16:53, 15 November 2022

The project shall develop an application that applies big data technologies to social media and/or other open data. At least part of the project shall use Spark and run in the NREC cloud. The project should be carried out in groups of three, and never more. Working individually or in pairs is possible, but not recommended.

This autumn, we specifically invite projects that use big data for the news.

More information about possible projects, deadlines, and other requirements will appear here soon.

Proposing a theme: deadline

Everyone who intends to take the course must be included in a project proposal sent by email to Andreas.Opdahl@uib.no, with all group members on Cc. The subject line must contain the string "INFO319 Project Proposal".

Proposal deadline: Wednesday October 12th 1500

Project presentations

Final project presentations: Thursday December 8th 1015

Depending a little on the number of project groups, each presentation will be brief: 15 minutes per group, plus 5 minutes for questions and comments.

You may demonstrate your project live (most convincing), or you may replay a recorded demonstration (which is good to have as a backup in any case). In addition, I expect each presentation to address/answer at least these points:

  • what have you made? - or: what does your application do?
  • which technologies have you used (languages, APIs, other software, etc.)?
  • where did you get your data from? - and/or: which datasets have you used?
  • why is it a good idea to do this using big data and big data technologies? - or: what does your system do that was not possible (or at least not easy) to do before?
  • exactly what have you done and programmed so far?
  • what are you planning to do in the final few days?
  • have you got any particular problems you need to address?

Project submission

Final project submission: December 12th.

Submit your project through Inspera as a single ZIP archive. The version of your project that you submit should be anonymous.

Since the project is graded, this is an official deadline. If you do not submit on time, you will not be allowed to take the course exam a week later.

Provide a short video (max 5 minutes) that shows your system running, with voice comments.

Comment your code sparsely and in-line. You do not need additional documentation, but you should provide a precise description of how to run your system. For example, explain:

  • which additional packages need to be installed
  • which datasets need to be downloaded
    • do not include large datasets (>10M) in your ZIP file
    • but it is fine to include smaller test datasets
  • where credentials (like a Twitter token) must be added, if they are needed to run the code
  • which other systems must be running first (e.g., Kafka, HDFS, YARN)
  • how to start your system (in particular if it consists of several programs)
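
As an illustration of the packaging advice above, here is a minimal Python sketch that builds a single ZIP archive while leaving out large data files. It is not an official course tool. The function name build_submission_zip and the MAX_BYTES constant are hypothetical, and reading ">10M" as 10 megabytes is an assumption.

```python
import zipfile
from pathlib import Path

# Assumption: ">10M" in the list above is read as 10 megabytes.
MAX_BYTES = 10 * 1024 * 1024


def build_submission_zip(project_dir, zip_path, max_bytes=MAX_BYTES):
    """Zip project_dir into zip_path, skipping files larger than max_bytes.

    Returns the relative paths of the skipped files, so that they can be
    listed in the run instructions as datasets to download separately.
    """
    project_dir = Path(project_dir)
    zip_path = Path(zip_path)
    skipped = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(project_dir.rglob("*")):
            if not path.is_file():
                continue  # directories are implied by the file paths
            if path.resolve() == zip_path.resolve():
                continue  # never add the archive to itself
            relative = path.relative_to(project_dir).as_posix()
            if path.stat().st_size > max_bytes:
                skipped.append(relative)  # too large: leave it out
                continue
            archive.write(path, arcname=relative)
    return skipped
```

Calling build_submission_zip("my_project", "submission.zip") would write the archive and return the list of files that were left out. Such a list is a natural starting point for the "which datasets need to be downloaded" part of the run instructions.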