MongoDB

Basic data

Developer: MongoDB, Inc.
Initial release: 2009
Current version: 4.2.0 (August 9, 2019)
Operating systems: Linux, macOS, Microsoft Windows, OpenBSD
Programming languages: C++, Go, JavaScript, C, Python
Category: Document-oriented database
License: Server Side Public License
Website: www.mongodb.com

MongoDB (derived from the English word humongous, "gigantic") is a document-oriented NoSQL database management system written in the C++ programming language. Because the database is document-oriented, it can manage collections of JSON-like documents. This allows many applications to model data in a more natural way: data can be nested in complex hierarchies yet still be queried and indexed.

Development of MongoDB began in October 2007 at the company 10gen, which was renamed MongoDB, Inc. on August 27, 2013. The first release took place in February 2009. MongoDB was published as open source until October 15, 2018 and has since been available under the proprietary SSPL. It is the most widely used NoSQL database (as of October 2019).

System requirements

Binaries are available for Windows, Linux, macOS, and Solaris. MongoDB can be compiled on almost any little-endian system.

Structure

Databases

A MongoDB process can manage multiple databases, and a database can contain multiple collections. The database name and the collection name, separated by a period, form a namespace. For example, for a database that manages a company's data and a collection that contains all employees, one could choose the namespace firma.mitarbeiter (German for "company.employees").

Collections

A collection contains documents and can be compared to a table in a relational database. An essential difference is that the documents in a collection can be structured completely differently. They do not have to follow a schema, nor do the values of the same key have to be of the same data type.
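
The namespace and the schema freedom described above can be illustrated with plain Python dictionaries. This is only a conceptual sketch, not MongoDB's API: the nested dictionary `databases`, the helper `resolve`, and the sample employee documents are all hypothetical.

```python
# Hypothetical in-memory model: database name -> collection name -> documents.
databases = {
    "firma": {
        "mitarbeiter": [
            # Two documents in the same collection with different structure.
            {"id": 1, "name": "Anna", "skills": ["Python", "Go"]},
            # Same key "id", but a different data type (str instead of int).
            {"id": "E-2", "name": "Ben", "address": {"city": "Berlin"}},
        ]
    }
}


def resolve(namespace):
    """Split 'database.collection' and return that collection's documents."""
    database, collection = namespace.split(".", 1)
    return databases[database][collection]


documents = resolve("firma.mitarbeiter")
id_types = [type(doc["id"]).__name__ for doc in documents]
print(id_types)
```

Both documents live in the same collection even though their keys and the type of `id` differ, which a relational table definition would not allow.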

Capped collections

MongoDB supports size-limited document collections, also called capped collections. A capped collection is created with a fixed size and, if required, a maximum number of elements. A capped collection is the only kind of collection that preserves insertion order: as soon as the specified size is reached, it behaves like a ring buffer, with the newest documents overwriting the oldest.

A special type of cursor, called a tailable cursor, can be used with capped collections. The cursor is named after the Unix command tail -f. It does not close once it has finished returning the results, but waits and returns new results as soon as new documents are added to the collection.
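
The ring-buffer behaviour and the tailable cursor described above can be modeled roughly as follows. This is a toy sketch in plain Python rather than MongoDB's implementation or driver API: the class name `CappedCollection` and its methods are hypothetical, and the cap is expressed only as a document count, whereas a real capped collection is created with a size.

```python
from collections import deque


class CappedCollection:
    """Toy model of a capped collection (assumption: capped by document count)."""

    def __init__(self, max_docs):
        # A deque with maxlen drops the oldest entry once it is full,
        # which is exactly the ring-buffer behaviour.
        self._docs = deque(maxlen=max_docs)
        self._inserted = 0  # total number of inserts ever made

    def insert(self, doc):
        self._docs.append(doc)
        self._inserted += 1

    def find(self):
        """Return the retained documents in insertion order."""
        return list(self._docs)

    def tail(self, position=0):
        """Return documents inserted at or after `position`, plus the new position.

        Calling tail() again with the returned position yields only documents
        added in the meantime, similar in spirit to a tailable cursor.
        """
        first = self._inserted - len(self._docs)  # sequence number of oldest doc
        start = max(position, first)
        return list(self._docs)[start - first:], self._inserted


log = CappedCollection(max_docs=3)
for n in range(1, 6):
    log.insert({"n": n})

seen, position = log.tail()           # documents 3, 4, 5 (1 and 2 were overwritten)
log.insert({"n": 6})
new_docs, position = log.tail(position)  # only the document added since
```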

System Collections

MongoDB automatically creates system collections. One system collection contains all indexes of the database, another all namespaces, another JavaScript code, and others hold information on profiling and users.

Queries

MongoDB allows any field to be queried at any time. In addition to query-by-example searches, it supports range queries, regular expression searches, and other special types of queries. Queries can also include custom JavaScript functions. Queries can return specific document fields (instead of the entire document) and can sort, skip, and limit results. Queries can reach into embedded objects and arrays.

Each query result is provided as a cursor.
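
How such queries behave can be sketched with a small matcher written in plain Python. It is a simplified model under stated assumptions, not MongoDB's query engine: the helpers `get_path`, `matches`, and `find` and the sample data are hypothetical, only the `$gt` and `$lt` operators are handled, and a list is returned instead of a cursor.

```python
def get_path(doc, path):
    """Follow a dotted path such as 'address.city' into embedded documents."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for path, condition in query.items():
        value = get_path(doc, path)
        if isinstance(condition, dict):
            # Range condition, e.g. {"$gt": 30, "$lt": 40}.
            if value is None:
                return False
            for op, bound in condition.items():
                if op == "$gt":
                    ok = value > bound
                elif op == "$lt":
                    ok = value < bound
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != condition:
            # Query by example: the field must equal the given value.
            return False
    return True


def find(docs, query, fields=None, sort_by=None, skip=0, limit=None):
    """Filter, then sort, skip, limit, and project, in that order."""
    result = [d for d in docs if matches(d, query)]
    if sort_by is not None:
        result.sort(key=lambda d: get_path(d, sort_by))
    result = result[skip:]
    if limit is not None:
        result = result[:limit]
    if fields is not None:
        result = [{f: d[f] for f in fields if f in d} for d in result]
    return result


employees = [
    {"name": "Cy", "age": 41, "address": {"city": "London"}},
    {"name": "Bob", "age": 25, "address": {"city": "Berlin"}},
    {"name": "Ada", "age": 36, "address": {"city": "London"}},
]

londoners = find(
    employees,
    {"address.city": "London", "age": {"$gt": 30}},
    fields=["name"],
    sort_by="age",
)
```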

Indexing

The software supports index structures such as B-trees and geospatial indexes. Nested fields (as described above for ad hoc queries) can also be indexed. Indexing a field that holds a list creates an index entry for each individual element of the list.
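
The effect of indexing a list-valued field can be illustrated with a dictionary that maps each element to the documents containing it. This is a minimal sketch, not MongoDB's B-tree implementation: the function `build_multikey_index` and the sample documents are hypothetical, and document positions in the list stand in for document identifiers.

```python
from collections import defaultdict


def build_multikey_index(docs, field):
    """Map every value of `field` to the positions of the documents holding it.

    If the field holds a list, each element gets its own index entry.
    """
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        if field not in doc:
            continue
        value = doc[field]
        values = value if isinstance(value, list) else [value]
        for v in values:
            index[v].append(doc_id)
    return index


articles = [
    {"title": "a", "tags": ["db", "nosql"]},
    {"title": "b", "tags": ["sql"]},
    {"title": "c", "tags": "db"},
]

tag_index = build_multikey_index(articles, "tags")
db_articles = [articles[i]["title"] for i in tag_index["db"]]
```

A lookup for a single tag then touches only the matching index entry instead of scanning every document.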

When a query runs, MongoDB's query optimizer autonomously tries different query plans and selects the fastest; this trial is repeated periodically. Developers can see which index was used with the explain function and select a different index with the hint function.

Indices can be created and deleted at any time.

Older versions of MongoDB lacked support in indices for the usual alphabetical sorting of umlauts and other Unicode characters; collation support was added in version 3.4.

Aggregation

In addition to ad hoc queries, the database also supports other tools for aggregation including MapReduce and a grouping function similar to SQL's GROUP BY.
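
The idea behind MapReduce-style aggregation and its similarity to GROUP BY can be sketched in a few lines of Python. This is a conceptual illustration only: the function `map_reduce`, the callbacks, and the order data are hypothetical, and MongoDB itself expects the map and reduce functions to be written in JavaScript.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


orders = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
]

# Comparable to: SELECT customer, SUM(amount) FROM orders GROUP BY customer
totals = map_reduce(
    orders,
    map_fn=lambda doc: [(doc["customer"], doc["amount"])],
    reduce_fn=lambda key, values: sum(values),
)
```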

GridFS

GridFS ("Grid File System") can be used to store files that exceed the document size limit of 16 MB. This file storage mechanism has been used in plug-ins for Apache, nginx, and lighttpd.
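
GridFS works around the size limit by splitting a file into chunks that are each stored as a separate document. The following sketch shows the principle in plain Python; the functions `split_into_chunks` and `reassemble` are hypothetical, the tiny chunk size is chosen only for illustration, and the field names `files_id`, `n`, and `data` follow the layout of GridFS chunk documents.

```python
def split_into_chunks(file_id, data, chunk_size):
    """Split `data` into numbered chunk documents of at most `chunk_size` bytes."""
    return [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks):
    """Concatenate the chunks in sequence order to recover the original bytes."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


payload = bytes(range(10))
chunks = split_into_chunks("file-1", payload, chunk_size=4)
restored = reassemble(reversed(chunks))  # chunk order on arrival does not matter
```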

Differentiation from relational (SQL) databases

MongoDB is classified as a NoSQL database and as such differs from traditional databases in that it offers a less powerful query language. This is both a disadvantage and an advantage. On the one hand, more logic must be implemented in the application layer to achieve the same results as with SQL databases. On the other hand, MongoDB can distribute the database and the workload over several servers, which is not possible with monolithic SQL databases. There are, however, now also SQL databases, such as Exasol or Greenplum, that are distributed over several servers. Large join operations can only be completed in a reasonable time on such multi-server systems.

MongoDB aims to distribute data over several servers in order to increase availability through replication and to spread the work and data load through sharding (see below). Replication, however, has a disadvantage: after MongoDB confirms a write, there is by default a time window in which subsequent reads may still return the old data. This consistency model is known as eventual consistency.

Another feature that distinguishes MongoDB from relational databases is the absence of a fixed schema. While the structure of an entry in a relational database is fixed by the table definition, entries in MongoDB can differ freely from one another (even if they belong to the same collection). This freedom is said to support agile software development, since it is easier to react to changing requirements.

However, the data must then be given structure when it is analyzed.

Differentiation from other NoSQL databases

The CAP theorem is often used to classify databases on the basis of their quality characteristics. It states that in the event of a network partition, a distributed system must decide whether to remain available or to guarantee consistency. MongoDB opts for consistency here, but can maintain availability as long as a majority of the nodes in a replica set can communicate with one another. CouchDB, which offers similar functionality, by comparison puts availability above consistency.
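
The majority rule mentioned above can be written down as a short check. This is a simplified sketch of the principle only, with hypothetical function names; the actual election protocol of a replica set involves more than counting reachable members.

```python
def has_majority(reachable, total):
    """True if `reachable` members form a strict majority of `total` members."""
    return reachable > total // 2


def partition_outcome(total, side_a):
    """For a network split into two sides, report which side keeps a majority."""
    side_b = total - side_a
    return has_majority(side_a, total), has_majority(side_b, total)


# Five members split 3 / 2: only the larger side can stay available for writes.
five_nodes = partition_outcome(total=5, side_a=3)

# Four members split 2 / 2: neither side has a majority.
four_nodes = partition_outcome(total=4, side_a=2)
```

The second case illustrates why an even split leaves no side able to continue: consistency is preserved at the cost of availability.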

Administrative tools

Official tools

A connection to a running MongoDB server can be established in various ways. The Mongo Shell is included in the distribution. An HTTP-based administration interface and a REST interface can be opened in a browser once they have been enabled. Finally, drivers for numerous programming languages allow programmers to implement communication between their applications and MongoDB.

Mongo Shell

The Mongo Shell is a command-line client. It is used to administer MongoDB and lets users perform read and write operations. It provides a prompt at which commands can be executed in JavaScript.

Drivers

MongoDB comes with official drivers for C, C++, C#, Haskell, Java, JavaScript, Lisp, Perl, PHP, Python, Ruby, and Scala.

For some programming languages there are also officially supported ORMs for MongoDB, such as Mongoose for the Node.js platform.

Cloud-based monitoring service

MongoDB Management Service (MMS) is a cloud-based monitoring solution and alert service for MongoDB servers.

Graphical interfaces

There are several graphical user interfaces (GUIs) for viewing and editing the data. These include:

Name | Description | License | Linux / Windows / Mac
Studio 3T (formerly MongoChef) | a cross-platform MongoDB GUI | free license and proprietary | Yes / Yes / Yes
Nucleon BI Studio | business intelligence front end for MongoDB | proprietary | Yes
Fang of Mongo | a web-based UI created with Django and jQuery | GNU AGPL v3.0 | Yes
Nucleon Database Master | a Windows-based database client that also supports RDBMSs | proprietary | Yes
Futon4Mongo | a clone of the CouchDB Futon web interface for MongoDB | |
mms, Mongo Management Studio | a cross-platform and web-based GUI | free license and proprietary | Yes / Yes / Yes
Mongo3 | a Ruby-based interface | Apache License 2.0 | Yes / Yes / Yes
MongoHub | a native OS X application for managing MongoDB, inactive since April 2015 | | No / No / Yes
Opricot | a browser-based MongoDB shell written in PHP | GNU GPL v3.0 | Yes
Robo 3T (formerly Robomongo) | a cross-platform MongoDB GUI | GNU GPL v3.0 | Yes / Yes / Yes
UMongo (JMongoBrowser) | a cross-platform management GUI written in Java | various open source licenses |
DBHawk | a web-based MongoDB tool | proprietary | Yes / Yes / Yes

Replication

MongoDB offers two types of replication to compensate for failures of individual servers and to distribute the read load across several servers:

Master-slave replication

Master-slave replication is deprecated but still available. A master can perform read and write operations. A slave copies the data from the master and can only be used for reads or data backup, not for writes.

MongoDB allows developers to guarantee, on a per-operation basis, that an operation has been replicated to at least N servers.

Replica sets

Replica sets are similar to the master-slave relationship, but include the ability for the slaves to elect a new master if the current one fails.

Sharding

MongoDB scales horizontally using sharding (horizontal partitioning), an approach that is very similar to the scaling systems of Bigtable and PNUTS. The developer chooses a shard key that determines how the data in a collection is distributed. The data is split into ranges (based on the shard key) and distributed across several shards.

For certain operations, the application or its developer must be aware that it is communicating with a sharded cluster. For example, a "findAndModify" query must contain the shard key if the queried collection is sharded. The application communicates through a special routing process called "mongos", which looks just like a single MongoDB server. This "mongos" process knows which data is held by which shard and routes the query accordingly. All queries flow through this process: it not only forwards queries and responses, but also performs any necessary final merging and sorting of the data. Any number of "mongos" processes can be started, but usually one per application server is recommended.
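
Range-based routing of this kind can be sketched as follows. The class `RangeRouter`, the split points, and the shard names are hypothetical, and the sketch leaves out everything a real "mongos" process does beyond choosing target shards, such as merging and sorting results.

```python
import bisect


class RangeRouter:
    """Toy router: maps shard key values to shards by sorted split points."""

    def __init__(self, split_points, shards):
        # N split points divide the key space into N + 1 contiguous ranges.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self.split_points = sorted(split_points)
        self.shards = list(shards)

    def shard_for(self, key_value):
        """Return the shard whose range contains key_value."""
        return self.shards[bisect.bisect_right(self.split_points, key_value)]

    def targets(self, query, shard_key):
        """Shards that must be contacted for an equality query.

        With the shard key present, one shard suffices; without it,
        the query has to be sent to every shard.
        """
        if shard_key in query:
            return [self.shard_for(query[shard_key])]
        return list(self.shards)


router = RangeRouter(split_points=["h", "q"], shards=["shard0", "shard1", "shard2"])

targeted = router.targets({"username": "harry"}, shard_key="username")
broadcast = router.targets({"age": 30}, shard_key="username")
```

The second query shows why operations that must hit exactly one document need the shard key: without it the router cannot narrow the request to a single shard.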

Technical basics

For read and write access, data is first held in RAM and is only synchronized to disk by the operating system's mmap facility after some time (every 60 seconds by default). This yields a speed advantage, since RAM can be accessed in nanoseconds, whereas file access is in the millisecond range. One disadvantage is that if the server crashes, for example, all data that exists only in RAM is lost. MongoDB counteracts this disadvantage with journaling.
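
The general idea of journaling, recording each write in a durable log before relying on the in-memory copy, can be sketched as below. This is a generic write-ahead-log illustration under simplifying assumptions, not MongoDB's journal format: the class `JournaledStore` is hypothetical, and the "disk" and "journal" are ordinary Python objects standing in for files.

```python
class JournaledStore:
    """Toy key-value store with delayed flushes and a write-ahead journal."""

    def __init__(self):
        self.disk = {}     # stands in for the data files, updated only on flush
        self.journal = []  # stands in for a durable, append-only journal
        self.memory = {}   # the in-RAM view that reads and writes use

    def write(self, key, value):
        # Record the change in the journal first, then apply it in memory.
        self.journal.append((key, value))
        self.memory[key] = value

    def flush(self):
        """Periodic synchronization: persist memory, then empty the journal."""
        self.disk = dict(self.memory)
        self.journal.clear()

    def crash_and_recover(self):
        """Lose the RAM contents, then rebuild them from disk plus journal."""
        self.memory = dict(self.disk)
        for key, value in self.journal:
            self.memory[key] = value


store = JournaledStore()
store.write("a", 1)
store.flush()             # "a" is now in the data files
store.write("b", 2)       # "b" exists only in RAM and in the journal
store.crash_and_recover()
recovered = dict(store.memory)
```

Without the journal replay, the write to "b" would be lost, because it had not yet been synchronized to the data files when the crash occurred.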

Because of the use of mmap, the data size is limited to 2 GB on 32-bit computers (the upper limit is correspondingly much higher on 64-bit computers). The MongoDB server can only be used on little-endian systems, although most drivers run on both little-endian and big-endian systems.

Further features are:

  • UTF-8 encoding of documents. Non-UTF-8 data can be stored, queried, and retrieved using a special binary data type.
  • Support for dates, regular expressions, code, and binary data (all BSON types).
  • Server-side JavaScript execution: JavaScript is the lingua franca of MongoDB and can be used in queries and aggregation functions (such as MapReduce); JavaScript can also be sent directly to the database and executed there.

Licensing and Support

MongoDB was freely available under the GNU Affero General Public License until October 2018. The language drivers are available under an Apache license.

In October 2018, the developers of MongoDB switched to the proprietary Server Side Public License (SSPL) so that cloud providers could not use the database without contributing code back.
The SSPL requires that anyone who offers MongoDB as a service also publish the source code of the service under this license, including the code of all programs for management, user interfaces, monitoring, and backups. MongoDB submitted the license to the Open Source Initiative (OSI), where it was rejected. A new version 2 of the license was then submitted to the OSI, but withdrawn after it became clear that it would not be accepted. MongoDB is currently only available under the deprecated version 1. This was preceded by a similar license change by the developers of the Redis database.

Because of the license change, MongoDB has been removed from the Linux distributions Debian, Fedora, and Red Hat Enterprise Linux. The Fedora project decided that SSPL version 1 is not a free software license.

Security

Numerous MongoDB installations on the Internet can be read and in some cases even written by anyone. A search with Shodan returned 52,000 open databases in January 2017.

The reason is that no access control is configured in the default installation. If the database is later moved to a public server and the configuration is not adjusted, the data can be freely accessed from outside. This has been exploited in part by ransomware that encrypted the data. The vendor of the commercial version has long recommended protective measures.
