Introduction To Big Data - Formative Assessment 1 - HFF
Learning Objective:
Research is needed in order to complete this assignment. Formative assessment 1 will
cover the following concepts:
a. Explore data analytics in Microsoft Fabric.
Attributes/Competencies Assessed:
The learner should demonstrate the following knowledge in this assessment:
a. Unit standard(s)
N/A
Scope:
The scope of this formative assessment is based on additional research, to demonstrate a
solid understanding of Big Data.
Technical Aspects:
The number of pages for this formative assessment is 21 and the following font and size
should be used in your report:
a. Font: Arial
b. Size: 12 and 14 for headings
Save and upload the report as a .PDF with the following naming convention:
a. Student no_StudentName_StudentSurname_ModuleCode_FA1
b. Each team member must upload a copy of the evidence to their profile on the
myAIE portal.
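The naming convention above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `report_filename`, the sample values, and the module code `ITBD` are hypothetical, and the choice to strip spaces and use a lowercase `.pdf` extension is an assumption rather than a stated requirement.

```python
def report_filename(student_no: str, first_name: str, surname: str, module_code: str) -> str:
    """Build an upload filename in the form StudentNo_Name_Surname_ModuleCode_FA1.pdf."""
    parts = [student_no, first_name, surname, module_code]
    # Assumption: remove whitespace inside each part so that underscores
    # are the only separators in the filename.
    cleaned = ["".join(part.split()) for part in parts]
    if not all(cleaned):
        raise ValueError("every part of the filename must be non-empty")
    return "_".join(cleaned + ["FA1"]) + ".pdf"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(report_filename("12345678", "Thandi", "Nkosi", "ITBD"))
```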
Ensure adequate referencing is used when using information from either books or the
internet. Plagiarism is a serious offence and can result in 0% for the assessment when
excessive work is copied without proper referencing.
Please complete the following and sign as requested for Portfolio of Evidence (POE)
a. Pre-Assessment agreement (Save, sign and submit as PDF)
b. Assessment Feedback Agreement (Save, sign and submit as PDF)
Marking criteria:
Please see marking guideline below in the mark allocation table
Mark allocation for report
See Mark allocation sheet below
Question 1
(100)
Scenario
As a newly recruited Data Scientist, you need to investigate the latest Cloud Technologies in
your field.
You decide to create a Microsoft Fabric workspace and use it to ingest and analyze some
data. This will demonstrate some key elements of a large-scale data analytics solution.
General instructions
a) Under the section “Explore data analytics in Microsoft Fabric” is a series of activities
prescribed by Microsoft to achieve your objective.
b) Use your AIE student number as prefix to any user-specific name required during the
process of creating the solution.
a. As can be seen from this extract from step 4: “Create a new workspace with a
name of your choice…”.
b. These names need to be unique and must include your AIE student number.
c) Wherever you see the heading “my screenshot”, insert your evidence for that
specific activity/step, with your student number clearly visible.
a. There are eight (8) screenshots required.
Please use this MS Word version at first, which makes pasting your evidence easier, before
you convert the entire MS Word document to PDF for upload to the AMI portal.
Here is an example of using your [email protected] and what the evidence should look
like when answering step 1 with screenshot 1 required.
Note: You will need a Microsoft Fabric license to complete this exercise.
See Getting started with Fabric for details of how to enable a free Fabric trial license.
You will need a Microsoft school or work account to do this.
Use your [email protected] where account is your student number.
Create a workspace
Before working with data in Fabric, create a workspace with the Fabric trial enabled.
a) Sign in to Microsoft Fabric at https://app.fabric.microsoft.com.
ANSWER (Please paste your screenshots here)
b) In the menu bar on the left, select Workspaces (the icon looks similar to 🗇).
c) Create a new workspace with a name of your choice, selecting a licensing mode in
the Advanced section that includes Fabric capacity (Trial, Premium, or Fabric).
d) When your new workspace opens, it should be empty.
The data engineering home page includes tiles to create commonly used data
engineering assets.
b) In the Data engineering home page, create a new Lakehouse with a name of your
choice.
After a minute or so, a new lakehouse will be created.
ANSWER (Please paste your screenshots here)
c) View the new lakehouse, and note that the Lakehouse explorer pane on the left
enables you to browse tables and files in the lakehouse:
a. The Tables folder contains tables that you can query using SQL. Tables in a
Microsoft Fabric lakehouse are based on the open source Delta Lake file
format, commonly used in Apache Spark.
b. The Files folder contains data files in the OneLake storage for the lakehouse
that aren’t associated with managed delta tables. You can also create
shortcuts in this folder to reference data that is stored externally.
Currently, there are no tables or files in the lakehouse.
3. Ingest data
(20)
A simple way to ingest data is to use a Copy Data activity in a pipeline to extract the data
from a source and copy it to a file in the lakehouse.
a) On the Home page for your lakehouse, in the Get data menu, select New data
pipeline, and create a new data pipeline named Ingest Sales Data.
b) In the Copy Data wizard, on the Choose a data source page, select the Retail Data
Model from Wide World Importers sample dataset.
c) Select Next and view the tables in the data source on the Connect to data source
page.
d) Select the dimension_stock_item table, which contains records of products. Then
select Next to progress to the Choose data destination page.
e) On the Choose data destination page, select your existing lakehouse. Then select
Next.
f) Set the following data destination options, and then select Next:
a. Root folder: Tables
b. Load settings: Load to new table
c. Destination table name: dimension_stock_item
d. Column mappings: Leave the default mappings as-is
e. Enable partition: Unselected
g) On the Review + save page, ensure that the Start data transfer immediately option is
selected, and then select Save + Run.
A new pipeline containing a Copy Data activity is created, as shown here:
ANSWER (Please paste your screenshots here)
When the pipeline starts to run, you can monitor its status in the Output pane under the
pipeline designer. Use the ↻ (Refresh) icon to refresh the status, and wait until it has
succeeded.
h) In the hub menu bar on the left, select your lakehouse.
i) On the Home page, in the Lakehouse explorer pane, expand Tables and verify that
the dimension_stock_item table has been created.
Note: If the new table is listed as unidentified, use the Refresh button in the
lakehouse toolbar to refresh the view.
b) In the toolbar, select New SQL query. Then enter the following SQL code into the
query editor:
c) Select the ▷ Run button to run the query and review the results, which should reveal
that there are two brand values (N/A and Northwind) and show the number of
products in each.
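A query consistent with the result described in step c (a count of products per brand) might look like the sketch below. The table name dimension_stock_item and the Brand and StockItemKey columns come from this brief. The `Products` alias, the `ORDER BY` clause, the sample rows, and the use of an in-memory SQLite database as a stand-in for the Fabric SQL query editor are all assumptions made so the example can run anywhere.

```python
import sqlite3

# Hypothetical sample rows standing in for the real dimension_stock_item table.
SAMPLE_ROWS = [
    (1, "N/A"),
    (2, "N/A"),
    (3, "Northwind"),
]

# Assumed query: count the stock items in each brand.
QUERY = """
SELECT Brand, COUNT(StockItemKey) AS Products
FROM dimension_stock_item
GROUP BY Brand
ORDER BY Brand
"""


def count_products_by_brand(rows):
    """Load rows into an in-memory SQLite table and return (Brand, Products) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE dimension_stock_item (StockItemKey INTEGER, Brand TEXT)"
        )
        conn.executemany("INSERT INTO dimension_stock_item VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for brand, products in count_products_by_brand(SAMPLE_ROWS):
        print(brand, products)
```

In the Fabric query editor only the SQL text would be entered; the Python wrapper exists purely to make the sketch self-contained.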
b) In the toolbar, select New report to open a new browser tab containing the Power BI
report designer.
c) In the report designer:
a. In the Data pane, expand the dimension_stock_item table and select the Brand
and StockItemKey fields.
b. In the Visualizations pane, select the Stacked bar chart visualization (it’s the
first one listed). Then ensure that the Y-axis contains the Brand field and
change the aggregation in the X-axis to Count so that it contains the Count of
StockItemKey field. Finally, resize the visualization in the report canvas to fill
the available space.
ANSWER (Please paste your screenshots here)
Tip: You can use the » icons to hide the report designer panes in order to see the
report more clearly.
d) On the File menu, select Save to save the report as Brand Quantity Report in your
Fabric workspace.
You can now close the browser tab containing the report to return to your lakehouse. You
can find the report in the page for your workspace in the Microsoft Fabric portal.
Body of the report
Question 1.1a: 10
Question 1.1b: 10
Question 1.2: 10
Question 1.3a: 10
Question 1.3b: 10
Question 1.4: 10
Question 1.5a: 10
Question 1.5b: 10
Question 1.6: 10
Deductions
1 day late: -5
2 days late: -10
3 days late: -15
Total: 100
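The late deductions can be illustrated with a short calculation. This is a sketch under assumptions: it treats each deduction as marks subtracted from a raw mark out of 100, never lets the result fall below zero, and rejects lateness beyond 3 days because the table does not cover it. The function name `final_mark` is hypothetical.

```python
# Late-submission deductions taken from the mark allocation table.
LATE_DEDUCTIONS = {0: 0, 1: -5, 2: -10, 3: -15}


def final_mark(raw_mark: int, days_late: int) -> int:
    """Apply the late deduction to a raw mark out of 100."""
    if not 0 <= raw_mark <= 100:
        raise ValueError("raw_mark must be between 0 and 100")
    if days_late not in LATE_DEDUCTIONS:
        # Assumption: the table lists no deduction beyond 3 days late.
        raise ValueError("no deduction is listed for this number of days late")
    # Assumption: a mark cannot drop below zero.
    return max(0, raw_mark + LATE_DEDUCTIONS[days_late])


if __name__ == "__main__":
    print(final_mark(80, 2))
```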
PRE-ASSESSMENT AGREEMENT
Assessment Process
Evaluation of POE addressing Essential
Embedded Knowledge in unit standards.
Evaluation of Research Projects and other
evidence addressing specific unit standards.
Consultation: assessment plan and assessment
activities and instruments. Pre-assessment
moderation and interviews conducted at this
stage.
Observation: feedback on assessment against
specific outcomes, critical outcomes in unit
standards.
Feedback: to candidate regarding sufficiency of
evidence and possible interview to gain
supplementary evidence.
Feedback to candidate regarding assessment
findings as well as review process.
Feedback: Written feedback to be given to all stakeholders at the
end of the assessment process, as well as verbal
feedback to the candidate during assessment activities.
Recording Process: Process and findings to be recorded and submitted for
record keeping purposes as well as moderation and
verification.
Review Process: The review process is the responsibility of the assessor
and the candidate. Joint reviewing will take place after
feedback has been given to the candidate.
Right to appeal: The candidate must be advised of the right to appeal.
Resources Required: Assignments, POE, Assessments, Guides
I confirm that:
I have been consulted on and have agreed to the training and
assessment process as detailed in the assessment guide.
I have been advised of my right to appeal against any assessment that
is unfair, unreliable, invalid or impracticable.
I have read and understood the appeal procedure.
I know that assessments may be moderated or verified by an external
party.
The purpose of the assessment has been clearly explained to me.
The criteria have been discussed with me, and I know I will be assessed
against these criteria.
I know when and where I will be assessed, and I was given fair notice.
I know how the assessment will be done, and any other requirements
related to the assessment.
Assessor’s Signature:
Date:
Moderator’s Signature:
Date:
Qualification Name:
Qualification SAQA Number:
Subject Name: Introduction to Big Data
Unit standard Number(s):
Question 0
Question 1.1a
Question 1.1b
Question 1.2
Question 1.3a
Question 1.3b
Question 1.4
Question 1.5a
Question 1.5b
Question 1.6
Learner Signature:
Lecturer Signature:
Assessor Signature:
Moderator Signature:
Note to learner
Review the feedback provided by your lecturer to check that you have been
found competent in this assessment. If there are any areas where you have
been found not yet competent, you must redo those parts of the assessment
and resubmit within the stipulated time frame.
The section below will only be completed in cases where the learner was asked to
resubmit parts of the assessment where they were found not yet competent.
General feedback to learner (Attempt 2)
Supply comprehensive feedback on why the learner was found not yet competent (NYC)
Learner Number:
Learner Signature:
Lecturer Signature:
Assessor Signature:
Moderator Signature: