Infrastructure for E-Business (Infrastruktura za elektronsko poslovanje). Dr. Miloš CVETANOVIĆ

NoSQL. Infrastructure for E-Business. Dr. Miloš CVETANOVIĆ, Dr. Zaharije RADIVOJEVIĆ

SQL term        MongoDB (NoSQL) term
database        database
table           collection
row             document (BSON document)
column          field
index           index
table joins     embedded documents, linking
primary key     primary key (_id)
group by        aggregation framework

Features:
- Document-oriented database
- Support for dynamic queries
- Support for several index types
- Support for replication and high availability
- Support for horizontal scalability
- Support for atomic modifications
- Support for map/reduce jobs
- Support for storing large documents

Relational approach:

|_media
   |_cds
      |_id, artist, title, genre, releasedate
   |_cd_tracklists
      |_cd_id, songtitle, length

Non-relational approach:

|_media
   |_items
      |_<document>

{ "Type": "CD", "Artist": "Nirvana", "Title": "Nevermind", "Genre": "Grunge", "Releasedate": "1991.09.24",
  "Tracklist": [
    { "Track": "1", "Title": "Smells Like Teen Spirit", "Length": "5:02" },
    { "Track": "2", "Title": "In Bloom", "Length": "4:15" } ] }

BSON: binary JSON (JavaScript Object Notation)

{ _id: ObjectId(XXXXXXXXXXXX), hello: "world" }

\x27\x00\x00\x00 \x07 _id\x00 XXXXXXXXXXXX \x02 hello\x00 \x06\x00\x00\x00 world\x00 \x00

The _id field is 12 bytes long (generated automatically or by calling ObjectId()):

Byte:     0 1 2 3 | 4 5 6   | 7 8     | 9 10 11
Content:  time    | machine | process | counter
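The 12-byte _id layout on this slide (time, machine, process, counter) can be unpacked by hand. The sketch below is plain JavaScript and needs no MongoDB server; parseObjectId is a hypothetical helper, and the 4/3/2/3 byte split is an assumption that matches the ObjectId layout of MongoDB versions from the period of these slides.

```javascript
// Hypothetical helper: split a 24-character hex ObjectId string into the
// four parts named on the slide (time, machine, process, counter).
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  return {
    // bytes 0-3: seconds since the Unix epoch
    timestamp: parseInt(hex.slice(0, 8), 16),
    // bytes 4-6: machine identifier (kept as hex text)
    machine: hex.slice(8, 14),
    // bytes 7-8: process id
    pid: parseInt(hex.slice(14, 18), 16),
    // bytes 9-11: counter
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// The _id of the book document shown in the find() output of these slides.
const parts = parseObjectId("4c1a8a56c603000000007ecb");
console.log(new Date(parts.timestamp * 1000).toISOString(), parts);
```

Because the timestamp occupies the leading bytes, sorting by _id roughly follows insertion time.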

Basic commands (unpack the archive, create C:\data\db, start MongoDB)

> use library
switched to db library
> show dbs
admin
local
> show collections
system.indexes
> document = ( { "Type" : "Book", "Title" : "Definitive Guide to MongoDB, the", "ISBN" : "987-1-4302-3051-9", "Publisher" : "Apress", "Author" : ["Membrey, Peter", "Plugge, Eelco", "Hawkins, Tim"] } )
> db.media.insert(document)
> db.media.insert( { "Type" : "CD", "Artist" : "Nirvana", "Title" : "Nevermind",
    "Tracklist" : [
      { "Track" : "1", "Title" : "Smells like teen spirit", "Length" : "5:02" },
      { "Track" : "2", "Title" : "In Bloom", "Length" : "4:15" } ] } )

> db.media.find()
{ "_id" : ObjectId("4c1a8a56c603000000007ecb"), "Type" : "Book", "Title" : "Definitive Guide to MongoDB, the", "ISBN" : "987-1-4302-3051-9", "Publisher" : "Apress", "Author" : ["Membrey, Peter", "Plugge, Eelco", "Hawkins, Tim"] }
{ "_id" : ObjectId("4c1a86bb2955000000004076"), "Type" : "CD", "Artist" : "Nirvana", "Title" : "Nevermind", "Tracklist" : [ { "Track" : "1", "Title" : "Smells like teen spirit", "Length" : "5:02" }, { "Track" : "2", "Title" : "In Bloom", "Length" : "4:15" } ] }

> db.media.find( { Artist : "Nirvana" } )
{ "_id" : ObjectId("4c1a86bb2955000000004076"), "Type" : "CD", "Artist" : "Nirvana", "Title" : "Nevermind", "Tracklist" : [ { "Track" : "1", "Title" : "Smells like teen spirit", "Length" : "5:02" }, { "Track" : "2", "Title" : "In Bloom", "Length" : "4:15" } ] }

> db.media.find( { "Tracklist.Title" : "In Bloom" } )
{ "_id" : ObjectId("4c1a86bb2955000000004076"), "Type" : "CD", "Artist" : "Nirvana", "Title" : "Nevermind", "Tracklist" : [ { "Track" : "1", "Title" : "Smells like teen spirit", "Length" : "5:02" }, { "Track" : "2", "Title" : "In Bloom", "Length" : "4:15" } ] }

> db.media.find( { "Tracklist" : { "Track" : "1" } } )
    Returns no result, because a subobject given this way must match in full.

> db.media.find().sort( { Title : 1 } )
    1 sorts in ascending order, -1 in descending order.
> db.media.find().limit( 10 )
> db.media.find().skip( 20 )
> db.media.find().sort( { Title : 1 } ).limit( 10 ).skip( 20 )

> db.media.find().limit( 1 )
> db.media.findOne()

> db.media.find( { Artist : "Nirvana" }, { Title : 1 } )
{ "_id" : ObjectId("4c1a86bb2955000000004076"), "Title" : "Nevermind" }
    Title: 0 returns all fields except Title.

> db.createCollection("audit100", { capped : true, size : 20480, max : 100 } )
{ "ok" : 1 }
> db.audit100.validate()
{ "ns" : "media.audit100",
  "result" : " validate
    capped:1 max:100
    firstextent:0:54000 ns:media.audit100
    lastextent:0:54000 ns:media.audit100
    # extents:1
    datasize?:0 nrecords?:0 lastextentsize:20736
    padding:1
    first extent:
      loc:0:54000 xnext:null xprev:null
      nsdiag:media.audit100
      size:20736 firstrecord:null lastrecord:null
    capped outoforder:0 (OK)
    0 objects found, nobj:0
    0 bytes data w/headers
    0 bytes data wout/headers
    deletedlist: 1100000000000000000
    deleted: n: 2 size: 20560
    nindexes:0
  ",
  "ok" : 1, "valid" : true, "lastextentsize" : 20736 }

A capped collection has a fixed size: the oldest documents are deleted automatically, a document cannot change its size, and a document cannot be deleted explicitly.

> db.audit.find().sort( { $natural : -1 } ).limit( 10 )
    Returns the last 10 records added.
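The behaviour of a capped collection (fixed capacity, oldest documents dropped first, newest read back in reverse insertion order) resembles a ring buffer. The following is a minimal in-memory sketch, not how the server implements it; the CappedLog class is hypothetical and limits only the document count, whereas a real capped collection also enforces a byte size.

```javascript
// Hypothetical in-memory stand-in for a capped collection limited by
// document count (the real server also enforces a byte-size limit).
class CappedLog {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // When the limit is exceeded, the oldest document is dropped.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }

  // Newest first, in the spirit of sort({ $natural: -1 }).limit(n).
  latest(n) {
    if (n <= 0) return [];
    return this.docs.slice(-n).reverse();
  }
}

const audit = new CappedLog(3);
for (let i = 1; i <= 5; i++) {
  audit.insert({ event: i });
}
console.log(audit.latest(2)); // the two most recently inserted documents
```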

> db.media.count()
2
> db.media.find( { Publisher : "Apress", Type : "Book" } ).count()
1
> db.media.find( { Publisher : "Apress", Type : "Book" } ).skip( 2 ).count( true )
0
    count(true) takes skip and limit into account; otherwise count ignores them.

> db.media.distinct( "Title" )
[ "Definitive Guide to MongoDB, the", "Nevermind" ]

> db.media.group( {
    key : { Title : true },
    initial : { Total : 0 },
    reduce : function (items, prev) { prev.Total += 1 } } )
[ { "Title" : "Nevermind", "Total" : 1 }, { "Title" : "Definitive Guide to MongoDB, the", "Total" : 2 } ]

    key: defines the grouping key
    cond: additional condition that each document must satisfy
    finalize: function called before the result is returned
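To make the roles of key, initial and reduce in the group call concrete, here is a rough plain-JavaScript emulation over an array. The group function below is a hypothetical helper, not the server implementation, and the sample array is made up (it assumes the book was inserted twice, as the Total of 2 above suggests).

```javascript
// Hypothetical helper mimicking group({ key, initial, reduce }) on an array.
function group(docs, { key, initial, reduce }) {
  const groups = new Map();
  for (const doc of docs) {
    // Build the key document from the fields listed in `key`.
    const keyDoc = {};
    for (const field of Object.keys(key)) {
      keyDoc[field] = doc[field];
    }
    const id = JSON.stringify(keyDoc);
    // Each new group starts from a copy of `initial`.
    if (!groups.has(id)) {
      groups.set(id, { ...keyDoc, ...initial });
    }
    // `reduce` updates the running aggregate in place.
    reduce(doc, groups.get(id));
  }
  return [...groups.values()];
}

// Made-up sample data for illustration.
const media = [
  { Type: "Book", Title: "Definitive Guide to MongoDB, the" },
  { Type: "CD", Title: "Nevermind" },
  { Type: "Book", Title: "Definitive Guide to MongoDB, the" },
];

const totals = group(media, {
  key: { Title: true },
  initial: { Total: 0 },
  reduce: function (item, prev) { prev.Total += 1; },
});
console.log(totals);
```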

> dvd = ( { "Type" : "DVD", "Title" : "Matrix, The", "Released" : 1999, "Cast" : ["Keanu Reeves", "Carry Anne Moss", "Laurence Fishburne", "Hugo Weaving", "Gloria Foster", "Joe Pantoliano"] } )
> db.media.insert(dvd)
> dvd = ( { "Type" : "DVD", "Title" : "Blade Runner", "Released" : 1982 } )
> db.media.insert(dvd)
> dvd = ( { "Type" : "DVD", "Title" : "Toy Story 3", "Released" : 2010 } )
> db.media.insert(dvd)

> db.media.find( { Released : { $gte : 1999 } }, { "Cast" : 0 } )
{ "_id" : ObjectId("4c43694bc603000000007ed1"), "Type" : "DVD", "Title" : "Matrix, The", "Released" : 1999 }
{ "_id" : ObjectId("4c4369a3c603000000007ed3"), "Type" : "DVD", "Title" : "Toy Story 3", "Released" : 2010 }

> db.media.find( { Released : { $gte : 1990, $lt : 2010 } }, { "Cast" : 0 } )
{ "_id" : ObjectId("4c43694bc603000000007ed1"), "Type" : "DVD", "Title" : "Matrix, The", "Released" : 1999 }

> db.media.find( { Type : "Book", Author : { $ne : "Plugge, Eelco" } } )

> db.media.find( { Released : { $in : ["1999", "2008", "2009"] } }, { "Cast" : 0 } )
    No result: the listed values are strings, while Released is stored as a number.
> db.media.find( { Released : { $in : [1999, 2008, 2009] } }, { "Cast" : 0 } )
{ "_id" : ObjectId("4c43694bc603000000007ed1"), "Type" : "DVD", "Title" : "Matrix, The", "Released" : 1999 }

> db.media.find( { Released : { $nin : [1999, 2008, 2009] }, Type : "DVD" }, { "Cast" : 0 } )
{ "_id" : ObjectId("4c436969c603000000007ed2"), "Type" : "DVD", "Title" : "Blade Runner", "Released" : 1982 }
{ "_id" : ObjectId("4c4369a3c603000000007ed3"), "Type" : "DVD", "Title" : "Toy Story 3", "Released" : 2010 }

> db.media.find( { Released : { $all : ["2010", "2009"] } }, { "Cast" : 0 } )

> db.media.find( { "Type" : "DVD", $or : [ { "Title" : "Toy Story 3" }, { "ISBN" : "987-1-4302-3051-9" } ] } )
{ "_id" : ObjectId("4c5fc943db290000000067ca"), "Type" : "DVD", "Title" : "Toy Story 3", "Released" : 2010 }

> db.media.find( { Released : { $mod : [2, 0] } }, { "Cast" : 0 } )

> db.media.find( { "Title" : "Matrix, The" }, { "Cast" : { $slice : 3 } } )
{ "_id" : ObjectId("4c5fcd3edb290000000067cb"), "Type" : "DVD", "Title" : "Matrix, The", "Released" : 1999, "Cast" : [ "Keanu Reeves", "Carry Anne Moss", "Laurence Fishburne" ] }

    skip and limit do not work on arrays; $slice is used instead.
    For example, $slice: [-5, 4] jumps to the last 5 elements and limits the result to 4 of them.

> db.media.find( { Tracklist : { $size : 2 } } )
    Only documents whose array has exactly the given length.
> db.media.find( { Author : { $exists : true } } )
    Only documents in which the Author field exists.
> db.media.find( { Tracklist : { $type : 3 } } )
    Only documents in which the Tracklist field is of the given type:
    -1 MinKey, 1 Double, 2 Character string (UTF-8), 3 Embedded object, 4 Embedded array, 5 Binary data, 7 Object ID, 8 Boolean, 9 Date, 10 Null, 11 Regular expression, 13 JavaScript code, 14 Symbol, 15 JavaScript code with scope, 16 32-bit integer, 17 Timestamp, 18 64-bit integer, 127 MaxKey, 255 MinKey

> db.media.find( { Tracklist : { "$elemMatch" : { Title : "Smells like teen spirit", Track : "1" } } } )
    Matches only if a single document inside the array satisfies all the conditions.
    This differs significantly from:
> db.media.find( { "Tracklist.Title" : "Smells like teen spirit", "Tracklist.Track" : "1" } )
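The difference between $elemMatch and two separate dot-notation conditions is easiest to see on data where they disagree. The sketch below emulates both matching rules with plain array methods; the two sample documents are made up for illustration, and the filters are only a model of the query semantics, not MongoDB code.

```javascript
// Made-up sample documents: in "B" the title and the track number "1"
// belong to two different array elements.
const docs = [
  { Title: "A", Tracklist: [
    { Track: "1", Title: "Smells like teen spirit" },
    { Track: "2", Title: "In Bloom" } ] },
  { Title: "B", Tracklist: [
    { Track: "1", Title: "Intro" },
    { Track: "2", Title: "Smells like teen spirit" } ] },
];

// $elemMatch semantics: ONE array element must satisfy both conditions.
const elemMatch = docs.filter(d =>
  d.Tracklist.some(t => t.Title === "Smells like teen spirit" && t.Track === "1"));

// Dot-notation semantics: each condition may be satisfied by a DIFFERENT element.
const dotNotation = docs.filter(d =>
  d.Tracklist.some(t => t.Title === "Smells like teen spirit") &&
  d.Tracklist.some(t => t.Track === "1"));

console.log(elemMatch.map(d => d.Title));   // only "A"
console.log(dotNotation.map(d => d.Title)); // "A" and "B"
```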

> db.media.update( { "Title" : "Matrix, the" }, { "Type" : "DVD", "Title" : "Matrix, the", "Released" : "1999", "Genre" : "Action" }, true )
    If the third argument is true, an upsert is performed (update, or insert if no matching document exists).
> db.media.save( { "Title" : "Matrix, the" }, { "Type" : "DVD", "Title" : "Matrix, the", "Released" : "1999", "Genre" : "Action" } )

> manga = ( { "Type" : "Manga", "Title" : "One Piece", "Volumes" : 612, "Read" : 520 } )
> db.media.insert(manga)
> db.media.update( { "Title" : "One Piece" }, { $inc : { "Read" : 4 } } )
> db.media.update( { "Title" : "Matrix, the" }, { $set : { Genre : "Sci-Fi" } } )
    Sets the field.
> db.media.update( { "Title" : "Matrix, the" }, { $unset : { "Genre" : 1 } } )
    Removes the field.

For arrays:
> db.media.update( { "ISBN" : "1-4302-3051-7" }, { $push : { Author : "Griffin, Stewie" } } )
> db.media.update( { "ISBN" : "1-4302-3051-7" }, { $pull : { Author : "Griffin, Stewie" } } )
> db.media.update( { "ISBN" : "1-4302-3051-7" }, { $pop : { Author : 1 } } )
    A value of 1 removes the last added element, a value of -1 the first added element.
> db.media.update( { "ISBN" : "1-4302-3051-7" }, { $pushAll : { Author : ["Griffin, Louis", "Griffin, Peter"] } } )
> db.media.update( { "ISBN" : "1-4302-3051-7" }, { $addToSet : { Author : "Griffin, Brian" } } )
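What the update operators do to a document can be modelled on a plain object. The applyUpdate function below is a hypothetical helper written for this note, covering only $inc, $set, $unset and $addToSet on top-level fields; the "Genre" and "Tags" values are made-up illustration data, and real updates are executed by the server.

```javascript
// Hypothetical helper that mimics a few update operators on a plain object.
// It handles only top-level fields and primitive array members.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount;
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field];
  }
  for (const [field, value] of Object.entries(update.$addToSet || {})) {
    const current = result[field] || [];
    // $addToSet appends only when the value is not already present.
    result[field] = current.includes(value) ? current : [...current, value];
  }
  return result;
}

let manga = { Type: "Manga", Title: "One Piece", Volumes: 612, Read: 520 };
manga = applyUpdate(manga, { $inc: { Read: 4 } });
manga = applyUpdate(manga, { $set: { Genre: "Adventure" } });
manga = applyUpdate(manga, { $addToSet: { Tags: "pirates" } });
manga = applyUpdate(manga, { $addToSet: { Tags: "pirates" } }); // no duplicate added
manga = applyUpdate(manga, { $unset: { Genre: 1 } });
console.log(manga);
```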

$set, $unset, $inc, $push, $pushAll, $pull and $pullAll are atomic operations.

Update if current:
> db.media.update( { "Tracklist.Title" : "Been a son" }, { $inc : { "Tracklist.$.Track" : 1 } } )
> db.$cmd.findOne( { getlasterror : 1 } )
{ "err" : null, "updatedExisting" : true, "n" : 1, "ok" : 1 }

Modify and return:
> db.media.findAndModify( { query : { "ISBN" : "987-1-4302-3051-9" }, sort : { "Title" : 1 }, update : { $set : { "Title" : " Different Title" } } } )
> db.media.findAndModify( { query : { "ISBN" : "987-1-4302-3051-9" }, sort : { "Title" : 1 }, update : { $set : { "Title" : " Different Title" } }, new : true } )
    With new: true the new, modified value is returned.

Manual referencing:
> apress = ( { "_id" : "Apress", "Type" : "Technical Publisher", "Category" : ["IT", "Software", "Programming"] } )
> db.publisherscollection.insert(apress)
> book = ( { "Type" : "Book", "Title" : "Definitive Guide to MongoDB, the", "ISBN" : "987-1-4302-3051-9", "Publisher" : "Apress", "Author" : ["Membrey, Peter", "Plugge, Eelco", "Hawkins, Tim"] } )
> db.media.insert(book)
> book = db.media.findOne(); db.publisherscollection.findOne( { _id : book.Publisher } )

Using DBRef (robust against renaming):
{ $ref : <collectionname>, $id : <id value>[, $db : <database name>] }
> apress = ( { "Type" : "Technical Publisher", "Category" : ["IT", "Software", "Programming"] } )
> db.publisherscollection.save(apress)
> book = { "Type" : "Book", "Title" : "Definitive Guide to MongoDB, the", "ISBN" : "987-1-4302-3051-9", "Author" : ["Membrey, Peter", "Plugge, Eelco", "Hawkins, Tim"], Publisher : [ new DBRef('publisherscollection', apress._id) ] }
> db.media.save(book)

db.runCommand( {
    mapReduce: <collection>,
    map: <function>,
    reduce: <function>,
    out: <output>,
    query: <document>,
    sort: <document>,
    limit: <number>,
    finalize: <function>,
    scope: <document>,
    jsMode: <boolean>,
    verbose: <boolean>
} );

Typical uses:

Inverted index
    map: <word, docid>
    reduce: <word, list(docid)>
Reverse graph of Web links
    map: <dstid, srcid>
    reduce: <dstid, list(srcid)>
Counting accesses to a given URL
    map: <URL, 1>
    reduce: <URL, totalcount>

Properties of the map function:

function() { ... emit(key, value); }

- References the current document through this
- Must not access the database
- Must not have side effects
- Must call the emit(key, value) function
- A single emit is limited in size (e.g. 50% of the maximum document size, ~8 MB)
- The number of emit calls per document is not limited
- Can access variables defined in the scope parameter

Properties of the reduce function:

function(key, values) { ... return result; }

- Must not access the database
- Must not have side effects
- Is not called for keys that have only one value
- Can access variables defined in the scope parameter
- The type of the return value must be identical to the type of the value emitted by map
- The function must be idempotent
- The result must not depend on the order of elements in the values array

These requirements can be written as:

reduce( key, [ C, reduce(key, [ A, B ]) ] ) == reduce( key, [ C, A, B ] )
reduce( key, [ reduce(key, valuesarray) ] ) == reduce( key, valuesarray )
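The two identities for the reduce function can be checked directly in plain JavaScript. The sketch below uses a summing reducer, which satisfies them, and an averaging reducer as a counter-example; both functions and the sample values A, B and C are made up for illustration.

```javascript
// A summing reducer: safe to apply to its own partial results.
const reduceSum = (key, values) => values.reduce((a, b) => a + b, 0);

// An averaging reducer: NOT safe, because an average of averages
// is generally different from the average of all values.
const reduceAvg = (key, values) =>
  values.reduce((a, b) => a + b, 0) / values.length;

const [A, B, C] = [3, 5, 7];

// reduce(key, [C, reduce(key, [A, B])]) == reduce(key, [C, A, B])
const sumNested = reduceSum("k", [C, reduceSum("k", [A, B])]);
const sumFlat = reduceSum("k", [C, A, B]);

// reduce(key, [reduce(key, values)]) == reduce(key, values)
const sumTwice = reduceSum("k", [reduceSum("k", [A, B, C])]);

const avgNested = reduceAvg("k", [C, reduceAvg("k", [A, B])]);
const avgFlat = reduceAvg("k", [C, A, B]);

console.log(sumNested === sumFlat, sumTwice === sumFlat); // true true
console.log(avgNested, avgFlat); // 5.5 versus 5
```

This is why an average is usually computed by reducing a running total and a count and dividing only in the finalize step.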

A map/reduce (M/R) job that returns the total price of all orders for each customer:

{ _id: ObjectId("50a8240b927d5d8b5891743c"),
  cust_id: "abc123",
  ord_date: new Date("Oct 04, 2012"),
  status: 'A',
  price: 250,
  items: [ { sku: "mmm", qty: 5, price: 2.5 },
           { sku: "nnn", qty: 5, price: 2.5 } ] }

var mapfunction1 = function() {
    emit(this.cust_id, this.price);
};

var reducefunction1 = function(keycustid, valuesPrices) {
    return Array.sum(valuesPrices);
};

db.orders.mapReduce( mapfunction1, reducefunction1, { out: "map_reduce_example" } );

{ out: { replace: "result_collection" } }   replaces the existing collection if it exists
{ out: { merge: "result_collection" } }     adds to the existing collection and overwrites identical keys
{ out: { reduce: "result_collection" } }    adds to the existing collection and reduces identical keys
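The flow of this job (emit per document, collect values per key, reduce each key) can be traced without a server. The mapReduce function below is a hypothetical mini runner, not the MongoDB implementation; unlike the shell, it passes emit to the map function as an argument, and the orders array is made-up sample data.

```javascript
// Hypothetical mini runner: collects emit() calls per key, then reduces.
function mapReduce(docs, map, reduce) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // The current document is bound to `this`, as in the shell.
  for (const doc of docs) {
    map.call(doc, emit);
  }
  const out = [];
  for (const [key, values] of emitted) {
    // reduce is skipped for keys that have a single value.
    const value = values.length === 1 ? values[0] : reduce(key, values);
    out.push({ _id: key, value: value });
  }
  return out;
}

// Made-up sample orders.
const orders = [
  { cust_id: "abc123", price: 250 },
  { cust_id: "abc123", price: 100 },
  { cust_id: "xyz789", price: 40 },
];

const totalsPerCustomer = mapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.price); },
  function (key, prices) { return prices.reduce((a, b) => a + b, 0); }
);
console.log(totalsPerCustomer);
```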

An M/R job that, for each ordered item, returns the average quantity per order:

var mapfunc2 = function() {
    for (var idx = 0; idx < this.items.length; idx++) {
        var key = this.items[idx].sku;
        var value = { count: 1, qty: this.items[idx].qty };
        emit(key, value);
    }
};

var reducefunc2 = function(keysku, countobjvals) {
    var reducedval = { count: 0, qty: 0 };
    for (var idx = 0; idx < countobjvals.length; idx++) {
        reducedval.count += countobjvals[idx].count;
        reducedval.qty += countobjvals[idx].qty;
    }
    return reducedval;
};

var finalizefunc2 = function (key, reducedval) {
    reducedval.avg = reducedval.qty / reducedval.count;
    return reducedval;
};

db.orders.mapReduce( mapfunc2, reducefunc2, { out: { merge: "map_reduce_example" }, finalize: finalizefunc2 } );

An incremental M/R job that, from a collection of sessions, returns for each user the total number of sessions, the total duration and the average duration of that user's sessions:

{ _id: ObjectId("50a8240b927d5d8b5891743c"), userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 }

var mapfunction = function() {
    var key = this.userid;
    var value = { userid: this.userid, total_time: this.length, count: 1, avg_time: 0 };
    emit( key, value );
};

var reducefunction = function(key, values) {
    var reducedobject = { userid: key, total_time: 0, count: 0, avg_time: 0 };
    values.forEach( function(value) {
        reducedobject.total_time += value.total_time;
        reducedobject.count += value.count;
    } );
    return reducedobject;
};

var finalizefunction = function (key, reducedvalue) {
    if (reducedvalue.count > 0)
        reducedvalue.avg_time = reducedvalue.total_time / reducedvalue.count;
    return reducedvalue;
};

Initial state:

db.sessions.save( { userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-03 14:23:00'), length: 110 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-03 15:02:00'), length: 120 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-03 16:45:00'), length: 45 } );
db.sessions.save( { userid: "a", ts: ISODate('2011-11-04 11:05:00'), length: 105 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-04 13:14:00'), length: 120 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-04 17:00:00'), length: 130 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-04 15:37:00'), length: 65 } );

db.sessions.mapReduce( mapfunction, reducefunction,
    { out: { reduce: "session_stat" }, finalize: finalizefunction } );

Added state:

db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } );

db.sessions.mapReduce( mapfunction, reducefunction,
    { query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
      out: { reduce: "session_stat" }, finalize: finalizefunction } );
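The effect of the reduce output mode in the second run can be traced by hand: a stored result and a new result with the same key are passed through the reduce function again and then finalized. The sketch below models this in plain JavaScript; mergeWithReduce is a hypothetical helper, the stored values for users "a" and "b" are the totals of the initial sessions listed above, and user "e" is made up to show a key with no stored result.

```javascript
const reduceFn = (key, values) => {
  const reduced = { userid: key, total_time: 0, count: 0, avg_time: 0 };
  values.forEach(v => {
    reduced.total_time += v.total_time;
    reduced.count += v.count;
  });
  return reduced;
};

const finalizeFn = (key, reduced) => {
  if (reduced.count > 0) reduced.avg_time = reduced.total_time / reduced.count;
  return reduced;
};

// Hypothetical helper: merge fresh results into stored ones by re-reducing
// values that share a key, then finalizing.
function mergeWithReduce(stored, fresh) {
  const merged = new Map(stored);
  for (const [key, value] of fresh) {
    const combined = merged.has(key)
      ? reduceFn(key, [merged.get(key), value])
      : value;
    merged.set(key, finalizeFn(key, combined));
  }
  return merged;
}

// Stored results after the initial run (sessions of Nov 3 and Nov 4).
const stored = new Map([
  ["a", { userid: "a", total_time: 200, count: 2, avg_time: 100 }],
  ["b", { userid: "b", total_time: 230, count: 2, avg_time: 115 }],
]);
// Results computed only from the newly added sessions.
const fresh = new Map([
  ["a", { userid: "a", total_time: 100, count: 1, avg_time: 0 }],
  ["e", { userid: "e", total_time: 50, count: 1, avg_time: 0 }],
]);

const sessionStat = mergeWithReduce(stored, fresh);
console.log([...sessionStat.values()]);
```

The approach works only because the reduce function accepts its own output as input, which is the property stated for reduce functions earlier in these slides.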

Properties of the aggregate function:

- Replaces map/reduce for simpler jobs
- Declarative; JavaScript is not needed
- Defines a chain of operations to be executed (pipeline)
- Evaluating expressions returns computed values
- Faster implementation (C++ instead of JavaScript)
- Addition of new operations is planned

Pipeline operations:

$project [shapes the result]
    includes and excludes specific fields from the result
    evaluates expressions (calls built-in functions, reads/writes a nested document)
$unwind [unwinds arrays]
    pairs each array element, one at a time, with the rest of the enclosing document
$sort [sorts documents]
    e.g. $sort: { key1: 1, key2: 1, ... }
$match [filters by a predicate]
    same format as for the find() method
$group [groups]
    defines the _id key on which grouping is done
    aggregates grouped values: $sum, $avg, $min, $max
    builds an array or a set from grouped values: $push, $addToSet
    the additional functions $first and $last make sense only after sorting
$limit [limits the number of documents]
$skip [skips a given number of documents]
$geoNear [sorts documents by geographic distance]
    must be the first operation in the pipeline

The zipcodes collection with information about cities:

{ "_id": "10280", "city": "NEW YORK", "state": "NY", "pop": 5574, "loc": [ -74.016323, 40.710537 ] }

Script that returns all states with more than 10 million inhabitants:

db.zipcodes.aggregate(
    { $group : { _id : "$state", totalpop : { $sum : "$pop" } } },
    { $match : { totalpop : { $gte : 10*1000*1000 } } } )

Example of the form of a result document:

{ "_id" : "AK", "totalpop" : 550043 }

Equivalent SQL:

SELECT state, SUM(pop) AS totalpop FROM zipstate GROUP BY state HAVING SUM(pop) > (10*1000*1000)
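The two pipeline stages of this query, $group followed by $match, correspond to a group-and-sum pass followed by a filter. Here is a minimal plain-JavaScript sketch of that correspondence; the documents and the state codes "XX" and "YY" are made up, and the code only models the stages rather than calling MongoDB.

```javascript
// Made-up sample documents in the shape of the zipcodes collection.
const zipcodes = [
  { _id: "00001", state: "XX", pop: 6000000 },
  { _id: "00002", state: "XX", pop: 5000000 },
  { _id: "00003", state: "YY", pop: 2000000 },
];

// $group: { _id: "$state", totalpop: { $sum: "$pop" } }
const popByState = new Map();
for (const z of zipcodes) {
  popByState.set(z.state, (popByState.get(z.state) || 0) + z.pop);
}
const grouped = [...popByState].map(([state, totalpop]) => ({
  _id: state,
  totalpop: totalpop,
}));

// $match: { totalpop: { $gte: 10*1000*1000 } }
const bigStates = grouped.filter(d => d.totalpop >= 10 * 1000 * 1000);

console.log(bigStates);
```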

Script that returns the average city population for each state:

db.zipcodes.aggregate(
    { $group : { _id : { state : "$state", city : "$city" }, pop : { $sum : "$pop" } } },
    { $group : { _id : "$_id.state", avgcitypop : { $avg : "$pop" } } } )

Example of the form of a result after the first grouping:

{ "_id" : { "state" : "CO", "city" : "EDGEWATER" }, "pop" : 13154 }

Example of the form of a result after the second grouping:

{ "_id" : "MN", "avgcitypop" : 5335 }

Script that, for each state, returns the cities with the largest and the smallest population:

db.zipcodes.aggregate(
    { $group : { _id : { state : "$state", city : "$city" }, pop : { $sum : "$pop" } } },
    { $sort : { pop : 1 } },
    { $group : { _id : "$_id.state",
                 biggestcity : { $last : "$_id.city" },
                 biggestpop : { $last : "$pop" },
                 smallestcity : { $first : "$_id.city" },
                 smallestpop : { $first : "$pop" } } },
    // $project is optional;
    // it modifies the output format
    { $project : { _id : 0,
                   state : "$_id",
                   biggestcity : { name : "$biggestcity", pop : "$biggestpop" },
                   smallestcity : { name : "$smallestcity", pop : "$smallestpop" } } } )

The users collection with information about members of a sports club:

{ _id : "jane", joined : ISODate("2011-03-02"), likes : ["golf", "racquetball"] }
{ _id : "joe", joined : ISODate("2012-07-02"), likes : ["tennis", "golf", "swimming"] }

Script that returns member names in upper case, sorted alphabetically:

db.users.aggregate( [
    { $project : { name : { $toUpper : "$_id" }, _id : 0 } },
    { $sort : { name : 1 } } ] )

Script that returns, for each member, the month of joining and the name, sorted by month:

db.users.aggregate( [
    { $project : { month_joined : { $month : "$joined" }, name : "$_id", _id : 0 } },
    { $sort : { month_joined : 1 } } ] )

Script that returns how many new members joined in each month:

db.users.aggregate( [
    { $project : { month_joined : { $month : "$joined" } } },
    { $group : { _id : { month_joined : "$month_joined" }, number : { $sum : 1 } } },
    { $sort : { "_id.month_joined" : 1 } } ] )

Script that returns the five most frequently liked sports:

db.users.aggregate( [
    { $unwind : "$likes" },
    { $group : { _id : "$likes", number : { $sum : 1 } } },
    { $sort : { number : -1 } },
    { $limit : 5 } ] )

Effect of $unwind on one document:

{ _id : "jane", joined : ISODate("2011-03-02"), likes : ["golf", "racquetball"] }

becomes

{ _id : "jane", joined : ISODate("2011-03-02"), likes : "golf" }
{ _id : "jane", joined : ISODate("2011-03-02"), likes : "racquetball" }
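The last pipeline ($unwind, $group, $sort, $limit) can be followed stage by stage on the two sample members. The sketch below is plain JavaScript that models each stage with array operations; it is an illustration of the semantics, not MongoDB code, and it assumes a descending sort so that the most liked sports come first.

```javascript
const users = [
  { _id: "jane", likes: ["golf", "racquetball"] },
  { _id: "joe", likes: ["tennis", "golf", "swimming"] },
];

// $unwind: one output document per array element, the rest of the
// document is copied unchanged.
const unwound = users.flatMap(u =>
  u.likes.map(like => ({ ...u, likes: like })));

// $group: { _id: "$likes", number: { $sum: 1 } }
const counts = new Map();
for (const d of unwound) {
  counts.set(d.likes, (counts.get(d.likes) || 0) + 1);
}

// $sort by number (descending), then $limit: 5
const topSports = [...counts]
  .map(([sport, number]) => ({ _id: sport, number: number }))
  .sort((a, b) => b.number - a.number)
  .slice(0, 5);

console.log(unwound.length, topSports);
```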