MCA-20-41: Big Data and Pattern Recognition
Type: Compulsory
Contact Hours: 4 hours/week
Examination Duration: 3 Hours
Mode: Lecture
External Maximum Marks: 75
External Pass Marks: 30 (i.e. 40%)
Internal Maximum Marks: 25
Total Maximum Marks: 100
Total Pass Marks: 40 (i.e. 40%)
Instructions to the paper setter for the end-semester examination:
The total number of questions shall be nine. Question number one will be compulsory and will consist of short/objective-type questions from the complete syllabus. In addition to the compulsory first question, there shall be four units in the question paper, each consisting of two questions. The student will attempt one question from each unit in addition to the compulsory question. All questions will carry equal marks.
Course Objectives: The aim of this course is to develop knowledge of big data tools including MapReduce, NoSQL and Hadoop. The course introduces data analysis and pattern recognition approaches and gives practical exposure to NoSQL.
Course Outcomes (COs): At the end of this course, the student will be able to:
MCA-20-41.1 understand Big Data strategies in a Big Data environment;
MCA-20-41.2 learn the basics of HDFS and MapReduce analytics using Hadoop;
MCA-20-41.3 acquire knowledge of pattern recognition approaches and methods;
MCA-20-41.4 develop solutions in NoSQL to meet current job requirements.
UNIT – I
Understanding Big Data: Concepts and Terminology, Big Data Characteristics, Different Types of Data, Identifying Data Characteristics, Business Motivations and Drivers for Big Data Adoption: Business Architecture, Business Process Management, Information and Communication Technology, Big Data Analytics Lifecycle, Enterprise Technologies and Big Data Business Intelligence, Industry Examples of Big Data.
UNIT – II
Data Governance for Big Data Analytics: Evolution of Data Governance, Big Data and Data Governance, Big Datasets, Big Data Oversight, Big Data Tools and Techniques: HDFS, MapReduce, YARN, ZooKeeper, HBase, Hive, Pig, Mahout, Developing Big Data Applications, Stepwise Approach to Big Data Analysis, Big Data Failure: Failure Is Common, Failed Standards, Legalities.
UNIT – III
Data Analysis and Pattern Recognition: Quantitative and Qualitative Analysis, Pattern Recognition Systems, Fundamental Problems in Pattern Recognition, Feature Extraction and Reduction, Paradigms, Pattern Recognition Approaches, Importance and Applications. Data Domain for Pattern Recognition. Pattern Recognition using the Nearest Neighbour Classifier and Modeling an AND Gate using Neural Nets.
UNIT – IV
An Overview of NoSQL, Characteristics of NoSQL, NoSQL Storage Types, Introduction to NoSQL Products, NoSQL Data Management for Big Data: Schema-less Models, Key-Value Stores, Document Stores, Tabular Stores, Object Data Stores, Graph Databases, NoSQL Misconceptions, NoSQL over RDBMS.
Text Books:
⦁ Thomas Erl, Wajid Khattak and Paul Buhler, Big Data Fundamentals: Concepts, Drivers & Techniques, Prentice Hall.
⦁ David Loshin, Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques, NoSQL, and Graph, Morgan Kaufmann.
⦁ Jules J. Berman, Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information, Morgan Kaufmann.
⦁ Gaurav Vaish, Getting Started with NoSQL, Packt Publishing.
⦁ Rajjan Shinghal, Pattern Recognition: Techniques and Applications, Oxford Higher Education.
Reference Books:
⦁ Michael Berthold, David J. Hand, Intelligent Data Analysis, Springer.
⦁ Jay Liebowitz, Big Data and Business Analytics, Auerbach Publications, CRC Press.
⦁ Pete Warden, Big Data Glossary, O’Reilly.
⦁ Michael Minelli, Michele Chambers, Ambiga Dhiraj, Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses, Wiley Publications.