All Question paper with solution mean Bachelorexam.com

(Aktu Btech) Data Analytics Important Unit-3 Mining Data Streams

These AKTU B.Tech Quantum notes cover Unit-3 of Data Analytics, Mining Data Streams, with key notes, frequently repeated questions, and worked answers.


Q1. Write short note on Data Stream Management System (DSMS). 

Ans. 

  • 1. A Data Stream Management System (DSMS) is a computer software system that is used to handle continuous data streams.
  • 2. A DSMS also provides flexible query processing, allowing the information required to be stated via queries.
  • 3. In a DSMS, queries are registered in the system only once and are then evaluated continually over the incoming data. They are therefore known as continuous or standing queries.
  • 4. A query can be expressed in one of two ways, depending on the system: as a declarative statement or as a sequence or graph of data processing operators.
  • 5. A declarative query is converted into a logical query plan that can be optimized. After that, the logical query is transformed into a physical query execution plan (QEP).
  • 6. The query execution plan includes calls to the operator implementation.
  • 7. In addition to the actual physical operators, query execution plans include queues for buffering the operators’ input and output.
  • 8. In QEPs, synopsis structures serve as supporting elements.
  • 9. When an operator needs to save some state in order to provide results, DSMS may provide particular synopsis methods and data structures.
  • 10. A synopsis is a summary of the stream or a section of the stream.
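
The ideas above (a standing query, operators connected by a queue, and a synopsis) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DSMS is built: the names `SlidingWindowAverage` and `continuous_query`, the threshold filter, and the window size are all assumptions chosen for the example.

```python
from collections import deque


class SlidingWindowAverage:
    """Operator that keeps a synopsis (the last `size` values) of the stream."""

    def __init__(self, size):
        # Synopsis: a bounded summary of the stream, not the whole stream.
        self.window = deque(maxlen=size)

    def process(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)


def continuous_query(stream, threshold, window_size):
    """Standing query: average of the last `window_size` readings above `threshold`."""
    queue = deque()  # buffer between the two operators
    average = SlidingWindowAverage(window_size)
    for reading in stream:
        if reading > threshold:  # operator 1: selection (filter)
            queue.append(reading)
        while queue:  # operator 2: windowed aggregate
            yield average.process(queue.popleft())


results = list(continuous_query([5, 12, 7, 20, 16, 3, 30], threshold=10, window_size=2))
print(results)  # [12.0, 16.0, 18.0, 23.0]
```

The query is defined once and produces a new result every time a qualifying tuple arrives, which is the sense in which it is "continuous".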

Q2. Explain the architecture of Data Stream Management System (DSMS).

Ans. 

  • 1. Data stream:
    • a. DSMS gets data streams as input.
    • b. Data stream elements are represented as tuples, which adhere to a relational schema of attributes and values.
  • 2. Stream manager:
    • a. Wrappers are provided that receive raw data from its source, buffer it, and order it by timestamp.
    • b. The stream manager’s job is to transform data to the data stream management system’s format.
  • 3. Router:
    • a. Following the query execution plan, the router adds tuples or data streams to the queue of the next operator.
  • 4. Queue manager:
    • a. A queue manager is in charge of managing queues and their accompanying buffers.
    • b. If main memory is full, the queue manager can be used to swap data from the queues to secondary storage.
  • 5. System catalog and storage manager:
    • a. Many systems use a storage manager to handle access to data held on disk (secondary storage).
    • b. It is needed when persistent data is combined with data from stream sources.
    • c. It is also needed for loading metadata about queries, query plans, streams, inputs, and outputs.
    • d. This metadata is kept in a system catalog in secondary storage.
  • 6. Scheduler: 
    • a. Scheduler determines which operator is executed next. 
    • b. The Scheduler interacts closely with the query processor.
  • 7. Query processor: It helps to execute the operator by interacting with scheduler. 
  • 8. QoS monitoring:
    • a. Several systems also feature a monitor that collects statistics on performance, operator output rate, and output latency.
    • b. These statistics can be used to improve system execution in a variety of ways.
  • 9. Query optimizer:
    • a. A load shedder can boost the throughput of a system by dropping stream elements, which are chosen using a sampling method.
    • b. The load shedder might be a component of a query optimizer, a standalone component, or a component of the query execution plan.
    • c. The statistics can be utilized to re-optimize and reorganize the operators in the current query execution plan. A query optimizer can be included for this purpose.
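
Load shedding by sampling, as described in point 9, can be sketched as follows. This is only an illustration under stated assumptions: the function name `shed_load`, the `keep_fraction` parameter, and uniform random sampling are choices made for the example, and real systems may decide what to drop using the QoS statistics instead.

```python
import random


def shed_load(stream, keep_fraction, seed=0):
    """Yield roughly `keep_fraction` of the tuples, chosen by uniform random sampling."""
    rng = random.Random(seed)  # seeded only so that repeated runs give the same sample
    for item in stream:
        if rng.random() < keep_fraction:
            yield item  # tuple is passed on to the next operator's queue
        # otherwise the tuple is dropped to relieve the overloaded operators


readings = list(range(1000))
kept = list(shed_load(readings, keep_fraction=0.25))
print(len(kept), "of", len(readings), "tuples kept")
```

Dropping tuples trades accuracy for throughput, so any result computed downstream becomes an approximation.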

Q3. Write short notes on stream computing.

Ans.

  • 1. Stream computing is a computing paradigm that reads data in stream form from groups of software or hardware sensors and computes continuous data streams.
  • 2. Stream computing employs software systems that process continuous data streams.
  • 3. Stream computing makes use of a software algorithm to examine data in real time.
  • 4. Stream computing is one viable technique for handling Big Data: it achieves very low latency by using massively parallel processing architectures.
  • 5. It is becoming one of the fastest and most efficient ways of extracting meaningful knowledge from Big Data.
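
As a small illustration of processing a stream as it arrives, the sketch below keeps a running mean without storing past values. The function name `running_mean` and the sample readings are assumptions for the example; it shows the one-pass, constant-memory style of computation rather than any specific stream computing product.

```python
def running_mean(stream):
    """Yield the mean of all values seen so far, using constant memory."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count  # incremental update; no history is stored
        yield mean


means = list(running_mean([10, 20, 30, 40]))
print(means)  # [10.0, 15.0, 20.0, 25.0]
```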

Q4. Explain Flajolet-Martin algorithm to count the distinct elements in a stream.

Ans.

  • 1. The Flajolet-Martin algorithm estimates the number of distinct objects in a stream or database in a single pass.
  • 2. If the stream contains n elements, m of which are unique, this algorithm runs in O(n) time and requires O(log m) memory. The key novelty is the memory usage, since an exact, brute-force algorithm would require O(m) memory.
  • 3. It gives an approximation for the number of unique objects, along with a standard deviation 𝜎, which can then be used to determine bounds on the approximation with a desired maximum error 𝜖, if needed. 

The Flajolet-Martin algorithm:

  • 1. Create a bit vector (bit array) of sufficient length L, such that 2^L > n, the number of elements in the stream. Usually a 64-bit vector is sufficient, since 2^64 is quite large for most purposes.
  • 2. The i-th bit in this vector/array represents whether we have seen a hash function value whose binary representation ends in 0^i (that is, i zeros). So initialize each bit to 0.
  • 3. Generate a good, random hash function that maps input (usually strings) to natural numbers. 
  • 4. Read input. For each word, hash it and determine the number of trailing zeros. If the number of trailing zeros is k, set the k-th bit in the bit vector to 1.
  • 5. Once input is exhausted, get the index of the first 0 in the bit array (call this R). With bits indexed from 0, this is just the number of consecutive 1s at the start of the array.
  • 6. Calculate the number of unique words as 2^R/𝜙, where 𝜙 is 0.77351.
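
The six steps above can be sketched in Python as follows. This is a minimal single-hash version under stated assumptions: the helper names `hash64` and `trailing_zeros` are hypothetical, SHA-256 truncated to 64 bits stands in for the "good, random hash function", and bits are indexed from 0.

```python
import hashlib

PHI = 0.77351  # correction factor used in step 6


def hash64(item):
    """Map a string to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def trailing_zeros(value, width=64):
    """Number of trailing zero bits in `value`; `width` if `value` is 0."""
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


def flajolet_martin(stream, width=64):
    """Estimate the number of distinct items in `stream` in a single pass."""
    bitmap = [0] * (width + 1)  # steps 1-2: one bit per possible zero count
    for item in stream:  # step 4: hash and record the trailing zeros
        bitmap[trailing_zeros(hash64(item), width)] = 1
    # Step 5: index of the first 0 bit (number of leading consecutive 1s).
    r = bitmap.index(0) if 0 in bitmap else len(bitmap)
    return (2 ** r) / PHI  # step 6


words = "to be or not to be that is the question".split()
print("estimated distinct words:", flajolet_martin(words))
```

With a single hash function the estimate can only take values of the form 2^R/𝜙, so it is coarse. Practical implementations usually combine the results of many hash functions to reduce the variance.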

Q5. Write short note on Real-Time Analytics Platform (RTAP) with example. 

Ans.

  • 1. A real-time analytics platform enables businesses to make the most of real-time data by assisting them in extracting important information and patterns from it.
  • 2. Such systems aid in measuring data from a business standpoint in real time, allowing for better data utilisation.
  • 3. An ideal real-time analytics platform would aid in the analysis, correlation, and prediction of outcomes in real-time.
  • 4. The real-time analytics platform assists firms in tracking things in real time, which aids in decision-making.
  • 5. The platforms link data sources for improved analytics and visualisation.
  • 6. Real-time analytics is the analysis of data as soon as that data becomes available. In other words, users get insights or can draw conclusions as soon as the data enters their system.

Examples of real-time analytics include: 

  • 1. Real time credit scoring, helping financial institutions to decide immediately whether to extend credit.  
  • 2. Customer relationship management (CRM), maximizing satisfaction and business results during each interaction with the customer. 
  • 3. Fraud detection at points of sale. 
  • 4. Targeting individual customers in retail outlets with promotions and incentives, while the customers are in the store and next to the merchandise. 
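
To make the fraud detection example concrete, here is a small sketch of a rule evaluated as each transaction arrives. The rule itself (more than `max_count` transactions on one card within `window_seconds`), the function name `flag_rapid_transactions`, and the sample events are all assumptions for illustration, not a method prescribed by any real-time analytics platform.

```python
from collections import defaultdict, deque


def flag_rapid_transactions(events, max_count=3, window_seconds=60):
    """Yield (timestamp, card) whenever a card exceeds `max_count` transactions
    within the last `window_seconds` seconds. Events must arrive in time order."""
    recent = defaultdict(deque)  # card -> timestamps still inside the window
    for timestamp, card in events:
        times = recent[card]
        times.append(timestamp)
        while times and timestamp - times[0] >= window_seconds:
            times.popleft()  # expire timestamps that have left the window
        if len(times) > max_count:
            yield timestamp, card


events = [(0, "A"), (5, "B"), (10, "A"), (20, "A"), (30, "A"), (95, "A"), (100, "B")]
alerts = list(flag_rapid_transactions(events))
print(alerts)  # [(30, 'A')]
```

The alert is raised at the moment the fourth transaction arrives, which is what distinguishes this from a batch report run after the fact.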

Q6. Explain the architecture of sentiment analysis. 

Ans. 1. Data collection:

  • a. Customers voice their opinions in public places such as blogs, message boards, and social networking sites.
  • b. Because feelings are represented in a variety of ways, including the context of writing and the use of abbreviations and slang, the data is vast.
  • c. Manual sentiment analysis of this volume of data is nearly impossible. As a result, programming languages such as R are used to process and analyze the data.

2. Text preparation: Text preparation means filtering the mined data before analysis. Text preparation is essentially data pre-processing.

3. Sentiment detection:

  • a. Each sentence of the opinion is reviewed for subjectivity at this level.
  • b. Sentences containing subjective information are kept, whereas those containing objective terms are deleted.

4. Sentiment classification:

  • a. Sentiments can be divided into two categories: positive and negative.
  • b. At this level of the sentiment analysis procedure, each detected subjective sentence is classified as positive, negative, good, bad, like, or dislike.

5. Presentation of output:  

  • a. Sentiment analysis seeks to convert unstructured language into useful data.
  • b. Following the conclusion of the analysis, the text results are displayed on graphs such as pie charts and bar charts.
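
The five stages above can be sketched as a tiny lexicon-based pipeline. This is only an illustration: the word lists `POSITIVE` and `NEGATIVE`, the function names, and the rule "a sentence with no clear polarity is treated as objective" are assumptions made for the example, and real systems use far richer resources and models.

```python
import re
from collections import Counter

# Tiny illustrative lexicons; real systems use much larger, curated resources.
POSITIVE = {"good", "great", "love", "like", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "dislike", "terrible"}


def prepare(text):
    """Text preparation: lowercase the text and keep word tokens only."""
    return re.findall(r"[a-z']+", text.lower())


def classify(sentence):
    """Return 'positive' or 'negative', or None when no clear polarity is found."""
    tokens = prepare(sentence)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score == 0:
        return None  # sentiment detection: treated as objective and discarded
    return "positive" if score > 0 else "negative"


def analyse(reviews):
    """Classify every sentence of every review and count the labels."""
    sentences = (s for review in reviews for s in re.split(r"[.!?]+", review))
    labels = (classify(s) for s in sentences)
    return Counter(label for label in labels if label)


reviews = [  # data collection: stand-in for text mined from blogs or forums
    "I love this phone. The battery is terrible!",
    "The box contains a charger.",
    "Great screen, good price.",
]
counts = analyse(reviews)
print(counts["positive"], "positive,", counts["negative"], "negative")  # 2 positive, 1 negative
```

The resulting counts are the kind of summary that the presentation stage would plot as a pie chart or bar chart.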