Cloud Computing

Passenger Airline Flights

Description:
There are two files containing lists of data:
1. AComp_Passenger_data.csv
2. Top30_airports_LatLong.csv
The first data file contains details of passengers that have flown between airports over a certain period. The data is in a comma-delimited text file, one line per record, using this format:
Passenger id: Format: XXXnnnnXXn
Flight id: Format: XXXnnnnX
From airport IATA/FAA code: Format: XXX
Destination airport IATA/FAA code: Format: XXX
Arrival time (local): Format: n [10] (This is in Unix ‘epoch’ time)
Total flight time (mins): Format: n [1..4]
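
The field formats above translate fairly directly into regular expressions. The following is a minimal sketch of one way to validate and parse a single line, not a prescribed solution. The names `PassengerRecord` and `PassengerRecordParser` are hypothetical, and the sample lines in `main` are invented for illustration rather than taken from the data file.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical value holder for one validated passenger record. */
final class PassengerRecord {
    final String passengerId, flightId, from, to;
    final long arrivalEpoch;   // seconds since the Unix epoch
    final int flightMinutes;

    PassengerRecord(String passengerId, String flightId, String from, String to,
                    long arrivalEpoch, int flightMinutes) {
        this.passengerId = passengerId;
        this.flightId = flightId;
        this.from = from;
        this.to = to;
        this.arrivalEpoch = arrivalEpoch;
        this.flightMinutes = flightMinutes;
    }
}

final class PassengerRecordParser {
    // Patterns transcribed from the listed formats (X = A-Z, n = 0-9).
    private static final Pattern PASSENGER_ID = Pattern.compile("[A-Z]{3}\\d{4}[A-Z]{2}\\d");
    private static final Pattern FLIGHT_ID = Pattern.compile("[A-Z]{3}\\d{4}[A-Z]");
    private static final Pattern AIRPORT = Pattern.compile("[A-Z]{3}");
    private static final Pattern EPOCH = Pattern.compile("\\d{10}");
    private static final Pattern MINUTES = Pattern.compile("\\d{1,4}");

    /** Returns the parsed record, or empty if any field breaks its format. */
    static Optional<PassengerRecord> parse(String line) {
        if (line == null) return Optional.empty();
        String[] f = line.split(",", -1);   // -1 keeps trailing empty fields
        if (f.length != 6
                || !PASSENGER_ID.matcher(f[0]).matches()
                || !FLIGHT_ID.matcher(f[1]).matches()
                || !AIRPORT.matcher(f[2]).matches()
                || !AIRPORT.matcher(f[3]).matches()
                || !EPOCH.matcher(f[4]).matches()
                || !MINUTES.matcher(f[5]).matches()) {
            return Optional.empty();
        }
        return Optional.of(new PassengerRecord(f[0], f[1], f[2], f[3],
                Long.parseLong(f[4]), Integer.parseInt(f[5])));
    }

    public static void main(String[] args) {
        // Invented sample lines: one well-formed, one with a missing destination.
        System.out.println(parse("ABC1234DE5,XYZ6789Q,DEN,FRA,1420564460,1049").isPresent());
        System.out.println(parse("ABC1234DE5,XYZ6789Q,DEN,,1420564460,1049").isPresent());
    }
}
```

Returning an `Optional` rather than throwing keeps the decision about what to do with a bad line (skip, repair, or log) in the caller, which is where the error-handling strategy lives.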

The second data file is a list of airport data comprising the name, IATA/FAA code, and location of each airport. The data is in a comma-delimited text file, one line per record, using this format:
Airport name: Format: X [3..20]
Airport IATA/FAA code: Format: XXX
Latitude: Format: n.n [3..13]
Longitude: Format: n.n [3..13]
Where: X is Uppercase ASCII. n is digit 0..9. [n..m] is the min/max range of the number of digits/characters in a string.
There are various errors in the AComp_Passenger_data.csv input data file; your code should successfully handle these in an appropriate manner. The output can be to screen, but must also be written to text files, the format of which is your decision.
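
One possible shape for such a strategy is sketched below; it is an illustration under stated assumptions, not the expected answer. It assumes that surrounding whitespace and a byte-order mark are safe to repair, that exact duplicate lines should be dropped, and that everything else is rejected with a reason. The class name `InputTriage` and the reject-file layout are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class InputTriage {
    // Whole-line pattern built from the passenger record field formats.
    private static final Pattern RECORD = Pattern.compile(
            "[A-Z]{3}\\d{4}[A-Z]{2}\\d,[A-Z]{3}\\d{4}[A-Z],[A-Z]{3},[A-Z]{3},\\d{10},\\d{1,4}");

    final Set<String> accepted = new LinkedHashSet<>();  // keeps input order, drops duplicates
    final List<String> rejected = new ArrayList<>();     // "lineNumber: reason" entries

    /** Repairs what is assumed to be harmless, then accepts or rejects the line. */
    void offer(int lineNumber, String raw) {
        // Assumed repair: strip a byte-order mark and all whitespace (no field contains spaces).
        String cleaned = raw.replace("\uFEFF", "").replaceAll("\\s+", "");
        if (cleaned.isEmpty()) {
            rejected.add(lineNumber + ": blank line");
        } else if (!RECORD.matcher(cleaned).matches()) {
            rejected.add(lineNumber + ": malformed: " + raw);
        } else if (!accepted.add(cleaned)) {
            rejected.add(lineNumber + ": duplicate: " + raw);
        }
    }

    /** Writes the reject log to a text file, one entry per line. */
    void writeRejects(Path out) throws IOException {
        Files.write(out, rejected, StandardCharsets.UTF_8);
    }
}
```

Keeping the rejected lines with their line numbers makes the run auditable: the report can state how many records were repaired, dropped, or rejected, and why.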
There are two additional data input files (AComp_Passenger_data_no_error_DateTime.csv and AComp_Passenger_data_no_error.csv). These may be used during the initial development and debugging phases only. For the final stages of development (i.e. error handling), use the AComp_Passenger_data.csv file. The ‘no_error’ files must not be used for the software runs that generate the data for the final report; doing so will result in a loss of marks.
Objectives
1. Determine the number of flights from each airport; include a list of any airports not used.
2. Create a list of flights based on the flight id. This output should include the passenger id, relevant IATA/FAA codes, the departure time, the arrival time (times to be converted to HH:MM:SS format), and the flight times.
3. Calculate the number of passengers on each flight.
4. Calculate the line-of-sight (nautical) miles for each flight and the total travelled by each passenger.
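
Objectives 2 and 4 each rest on a small calculation, sketched below as an illustration only. The sketch assumes that the epoch field can be formatted in UTC, that the departure time is the arrival time minus the flight time (if the field actually holds the departure time, add instead), and that "line-of-sight" distance can be approximated by the great-circle (haversine) distance on a sphere of radius 6371 km. The class name `FlightMath` and its method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class FlightMath {
    // Mean Earth radius of about 6371 km, converted at 1.852 km per nautical mile.
    private static final double EARTH_RADIUS_NM = 6371.0 / 1.852;

    // Assumption: times are rendered in UTC.
    private static final DateTimeFormatter HMS =
            DateTimeFormatter.ofPattern("HH:mm:ss").withZone(ZoneOffset.UTC);

    /** Formats Unix epoch seconds as HH:MM:SS. */
    static String toHms(long epochSeconds) {
        return HMS.format(Instant.ofEpochSecond(epochSeconds));
    }

    /** Derives the departure time from the arrival time and the flight duration. */
    static long departureEpoch(long arrivalEpoch, int flightMinutes) {
        return arrivalEpoch - flightMinutes * 60L;
    }

    /** Great-circle distance in nautical miles between two points given in degrees. */
    static double nauticalMiles(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin((phi2 - phi1) / 2);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;
        // min() guards against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_NM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // LHR to CDG, using the coordinates from the airport list.
        System.out.printf("%.1f nm%n", nauticalMiles(51.4775, -0.461389, 49.012779, 2.55));
        System.out.println(toHms(departureEpoch(1420564460L, 1049)));
    }
}
```

The total travelled by each passenger is then the sum of `nauticalMiles` over that passenger's flights, which fits naturally into a reduce step keyed on the passenger id.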

Tasks
1. For this task in the development process, develop a non-MapReduce executable prototype (in Java or C++). The objective is to develop the basic functional ‘building-blocks’ that will support the development objectives listed above, in a way that mimics something of the operation of the MapReduce/Hadoop framework. The solution may use multi-threading if this suits your particular design and implementation strategy. The marking strategy will reflect the appropriate use of coding techniques, succinct standard or Javadoc comments (only where really needed), data structures, and overall program design. The code should be kept under version control in a Subversion repository, managed from the command line.
The final results/output must use the AComp_Passenger_data.csv file. Error detection and handling for this task can be quite basic, but it must be robust and follow a logical, well-considered strategy; the choice of strategy is entirely for you to decide.
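
As a rough illustration of what "mimicking" the framework could look like, the single-threaded sketch below separates a job into map, shuffle (group by key), and reduce phases, and uses them to count distinct flights per departure airport. It is one possible design, not the required one. The names `MiniMapReduce`, `run`, `flightsPerAirport`, and `unusedAirports` are hypothetical, and the input lines are assumed to have been validated already.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.BiFunction;
import java.util.function.Function;

final class MiniMapReduce {
    /** Runs mapper over every input, groups the emitted pairs by key, then reduces each group. */
    static <I, K extends Comparable<K>, V, R> Map<K, R> run(
            List<I> input,
            Function<I, List<Map.Entry<K, V>>> mapper,
            BiFunction<K, List<V>, R> reducer) {
        // Map and shuffle: collect all values emitted for the same key (sorted by key).
        Map<K, List<V>> groups = new TreeMap<>();
        for (I item : input) {
            for (Map.Entry<K, V> kv : mapper.apply(item)) {
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        // Reduce: one call per key.
        Map<K, R> out = new TreeMap<>();
        for (Map.Entry<K, List<V>> g : groups.entrySet()) {
            out.put(g.getKey(), reducer.apply(g.getKey(), g.getValue()));
        }
        return out;
    }

    /** Objective 1 sketch: emit (from airport, flight id), then count distinct flight ids. */
    static Map<String, Integer> flightsPerAirport(List<String> validLines) {
        return MiniMapReduce.<String, String, String, Integer>run(
                validLines,
                line -> {
                    String[] f = line.split(",");
                    List<Map.Entry<String, String>> pairs = new ArrayList<>();
                    pairs.add(new SimpleEntry<>(f[2], f[1]));
                    return pairs;
                },
                (airport, flightIds) -> new HashSet<>(flightIds).size());
    }

    /** Airports in the reference list that never appear as a departure airport. */
    static Set<String> unusedAirports(Set<String> allCodes, Map<String, Integer> counts) {
        Set<String> unused = new TreeSet<>(allCodes);
        unused.removeAll(counts.keySet());
        return unused;
    }
}
```

Counting distinct flight ids, rather than input lines, matters here because each flight appears once per passenger. Swapping the mapper to emit (flight id, passenger id) would give the passenger count per flight with the same `run` method.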
2. Write a brief report (no more than 7 pages for the actual content, not including title page) explaining:
a. The high-level description of the development of the prototype software.
b. A simple description of the Subversion command line process undertaken.
c. A fairly detailed description of the MapReduce functions you are replicating.
d. The output format of any reports that each job produces.
e. The strategy derived to handle input data error detection/correction and/or run-time recovery.
f. A self-appraisal of your (equivalent) MapReduce run-time software, with suggestions as to how it may be usefully improved upon. You may comment on any aspect of the development process.
Airport data (Top30_airports_LatLong.csv): airport name, IATA/FAA code, latitude, longitude
ATLANTA ATL 33.636719 -84.428067
BEIJING PEK 40.080111 116.584556
LONDON LHR 51.4775 -0.461389
CHICAGO ORD 41.978603 -87.904842
TOKYO HND 35.552258 139.779694
LOS ANGELES LAX 33.942536 -118.40808
PARIS CDG 49.012779 2.55
DALLAS/FORT WORTH DFW 32.896828 -97.037997
FRANKFURT FRA 50.026421 8.543125
HONG KONG HKG 22.308919 113.914603
DENVER DEN 39.861656 -104.67318
DUBAI DXB 25.252778 55.364444
JAKARTA CGK -6.125567 106.655897
AMSTERDAM AMS 52.308613 4.763889
MADRID MAD 40.493556 -3.566764
BANGKOK BKK 13.681108 100.747283
NEW YORK JFK 40.639751 -73.778925
SINGAPORE SIN 1.350189 103.994433
GUANGZHOU CAN 23.392436 113.298786
LAS VEGAS LAS 36.080056 -115.15225
SHANGHAI PVG 31.143378 121.805214
SAN FRANCISCO SFO 37.618972 -122.37489
PHOENIX PHX 33.434278 -112.01158
HOUSTON IAH 29.984433 -95.341442
CHARLOTTE CLT 35.214 -80.943139
MIAMI MIA 25.79325 -80.290556
MUNICH MUC 48.353783 11.786086
KUALA LUMPUR KUL 2.745578 101.709917
ROME FCO 41.804475 12.250797
ISTANBUL IST 40.976922 28.814606