March 2017
00:07 svm_invictvs joined
00:40 heedypo joined
00:42 raspado joined
00:43 heedypo left
00:47 harry1 joined
00:59 Sasazuka joined
01:48 yhf joined
01:49 yhf joined
01:51 yhf joined
01:52 yhf joined
01:54 yhf_ joined
01:57 yhf joined
02:19 goldfish joined
02:52 Soopaman joined
02:52 <Soopaman> greetings all
02:54 <Soopaman> is there an easier way to gather all of the fields that are ObjectId references?
02:54 gentunian joined
02:55 <Soopaman> (whatever the foreign key equivalent is in datastore)
03:23 nocturne777 joined
03:24 raspado joined
03:28 hwrdprkns joined
03:28 realisation joined
03:29 goldfish joined
03:56 kashike joined
04:02 philipballew joined
04:20 kyuwonchoi joined
04:44 preludedrew joined
04:48 castlelore joined
04:48 castlelore joined
04:54 ayogi joined
05:05 artok joined
06:02 igniting joined
06:21 lpin joined
06:33 kexmex joined
06:36 sfa joined
06:39 philipballew joined
06:42 freeport joined
06:52 rendar joined
07:10 quattro_ joined
07:15 coudenysj joined
07:18 ayrus joined
07:19 <ayrus> Hi, how do I query a collection when the collection name contains "|"? Like db.test|data_op.findOne(); how do I escape the pipe? Error: "E QUERY [thread1] ReferenceError: data_op is not defined :"
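A collection name containing "|" cannot be reached with dot notation, because the shell parses everything after the pipe as a separate JavaScript expression; passing the name as a string avoids that. A minimal shell sketch, reusing the collection name from the question:

    // db.test|data_op.findOne() fails: "data_op" is parsed as a bare identifier
    db.getCollection("test|data_op").findOne()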
07:27 nanohest joined
07:53 amitprakash joined
07:55 <amitprakash> Hi, is it possible to launch mongod in hidden replica mode via config file?
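For context, the hidden flag is part of the replica set configuration rather than the mongod configuration file; the config file only names the set (replication.replSetName), and the member is hidden from the shell with rs.reconfig(). A rough sketch, assuming the member at index 2 is the one to hide:

    // hidden members must also have priority 0 so they are never elected
    cfg = rs.conf()
    cfg.members[2].priority = 0
    cfg.members[2].hidden = true
    rs.reconfig(cfg)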
08:31 silenced joined
08:40 Lujeni joined
08:41 quattro_ joined
08:57 evil_gordita joined
09:01 Soopaman joined
09:17 jwd joined
09:26 kexmex joined
09:35 freeport joined
09:35 VerdeRnel joined
09:43 mxmxmx joined
09:43 mxmxmx left
09:51 SkyRocknRoll joined
09:57 silenced joined
10:02 VeeWee joined
10:06 undertuga joined
10:07 ayrus left
10:10 jamesaxl joined
10:21 dgdekoning joined
10:24 jokke joined
10:25 <jokke> hey
10:26 <jokke> i get the following error in an application where we unfortunately used pretty deeply nested mongodb documents: failed with error 16837: "cannot use the part (2 of sections.4.page_modules.2._id) to traverse the element ({2: null})"
10:26 <jokke> any ideas what this means?
10:50 coudenysj joined
10:55 <VerdeRnel> jokke: cannot help you but I hope you will solve your problem :)
10:55 <jokke> thanks :)
10:57 coudenysj joined
11:01 g-n0m3 joined
11:02 <VerdeRnel> jokke: I just started this morning tbh :D
11:03 <jokke> then a word of advice: don't use nested documents more than one level deep if it's a "has_many" or rather "embeds_many" type of thing
11:04 <jokke> there's a bug regarding problems that arise from this and it hasn't been fixed in years
11:05 <VerdeRnel> ok, don't know what nested document is but I'll see later
11:05 <jokke> { this: { is: { a: { deeply: { nested: 'document' } } } } }
11:05 <jokke> but that'd be fine ^
11:06 <VerdeRnel> ah
11:06 <VerdeRnel> ok
11:06 <jokke> { this: [{ wouldnt: [{ be: 'fine' }] }] }
11:07 <jokke> not with the positional operator
11:07 <jokke> (you don't need to know what that is just yet
11:07 <jokke> )
11:08 <jokke> basically mongodb lets you update deeply nested documents
11:09 <jokke> if you'd want to update 'fine' to say 'great', you could do it like this: update({ $set: { 'this.0.wouldnt.0.be': 'great' } })
11:09 <VerdeRnel> okay
11:09 <jokke> the zeros are the index of the document you're accessing
11:10 <VerdeRnel> yes
11:10 amitprakash left
11:11 <jokke> but there's the positional operator '$' which acts as a placeholder for these indexes (or rather one of them) and is substituted by the index for which the referenced document matches the query
11:11 <jokke> with multiple levels of nesting this goes terribly wrong
11:12 <jokke> which is what i'm suspecting is happening in my case
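For reference, the limitation jokke is describing: the positional '$' operator can stand in for only one array index per update path, so a document with arrays nested inside arrays cannot be updated through both levels positionally. A small sketch using the example document above (db.coll is a placeholder collection):

    // document: { this: [ { wouldnt: [ { be: 'fine' } ] } ] }
    // one array level is fine: '$' is replaced by the index of the matched element
    db.coll.update(
      { 'this.wouldnt.be': 'fine' },
      { $set: { 'this.$.wouldnt.0.be': 'great' } }
    )
    // but '$' may appear only once per path, so traversing two nested arrays
    // positionally, e.g. { $set: { 'this.$.wouldnt.$.be': 'great' } }, is not supported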
11:12 <VerdeRnel> mmh :/
11:13 <VerdeRnel> have to go, good day
11:13 <VerdeRnel> \o
11:13 <jokke> you too
11:13 <jokke> o/
11:13 <VerdeRnel> ty
11:13 YoY joined
11:13 <quattro_> Can anyone tell me why with wiredtiger I get these hourly disk I/O write buildups? http://i.imgur.com/j0CYOKJ.png
11:14 <quattro_> Even when prefilling data in the documents it’s still spiking
11:14 <jokke> wat
11:14 <jokke> why are you prefilling data with wiredtiger?
11:15 <quattro_> I just tried to see if that would stop this from happening but it did not
11:16 basiclaser joined
11:16 <quattro_> It just seems like mongodb keeps writing the whole document when updating a single minute value
11:16 <jokke> oooh
11:16 <jokke> okay
11:17 <jokke> yeah don't do that with wiredtiger
11:17 <jokke> preallocation of time series data is history
11:17 <jokke> just insert a doc per sample
11:18 <quattro_> so I would have to use insert instead of upserts?
11:18 <jokke> yes, but get rid of the structure 0: { 0: ... 59: .. } ... 24: ... if that's what you're using
11:19 <jokke> *23 rather :)
11:19 <jokke> can you post an example document
11:20 <quattro_> the hourly documents made lookups fast for me, also I would go from 32 million documents to ~2 billion
11:20 <jokke> yes
11:20 <jokke> that's the way to go
11:20 <jokke> wiredtiger has index compression
11:21 <jokke> we did a rewrite last year of a project just like yours
11:21 <jokke> from mmap to wiredtiger
11:22 <jokke> with the correct indexes (we use a compound index of timestamp and sensor id) this is not a problem
11:22 <jokke> and you will notice that the code around this will get a lot simpler
11:23 raspado joined
11:23 <jokke> still a lot of work
11:23 <jokke> aggregations will be so much simpler
11:23 <jokke> and can be run in the aggregation framework
11:24 <jokke> which is close to impossible with the hourly docs
11:24 <jokke> docs are always structured the same across aggregations
11:24 <jokke> the benefits are huge
11:24 itaipu joined
11:25 <jokke> atm we have 117099735 docs like this in our db
11:25 <jokke> and no problem whatsoever
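A rough sketch of the layout jokke is describing, with placeholder collection and field names: one small document per sample plus a compound index on timestamp and sensor id, instead of preallocated hourly documents.

    // one document per sample
    db.metrics.insert({
      sensor_id: 'cpu.usage',
      timestamp: ISODate('2017-03-15T11:25:02Z'),
      value: 42.7
    })
    // compound index so time-range queries per sensor stay fast
    db.metrics.createIndex({ timestamp: 1, sensor_id: 1 })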
11:26 <quattro_> http://pastebin.com/UkRQb4sx here’s a document of cpu usage metric
11:28 <jokke> hm not clear to me from that what's a single sample
11:28 <quattro_> Yeah it would just mean a LOT more documents for me, I usually need a couple hours of data so I can just fetch like 6 documents quickly to create a chart
11:29 <jokke> yeah
11:29 <jokke> no problem with the right index
11:29 <quattro_> So these hourly documents will surely not scale well on mongodb
11:30 <jokke> what do you mean?
11:30 <quattro_> It works right now with the amount of updates I’m doing but the more updates the higher the written data will be
11:31 <jokke> sure
11:31 yeitijem joined
11:31 <jokke> but i don't see the problem
11:31 <quattro_> I’m just curious why mongodb writes the whole document to disk when doing an upsert/update
11:32 <quattro_> I was wondering if this is expected behaviour
11:32 <quattro_> or if i’m doing something wrong
11:32 <jokke> you are :)
11:32 <jokke> as i pointed out
11:33 <quattro_> mongodb even recommends these hourly documents for time series on their schema design page, so this is wrong?
11:35 <jokke> yes
11:35 <jokke> it's wrong
11:35 <jokke> outdated
11:36 <jokke> it was true for mmap
11:36 <jokke> it's not for wiredtiger
11:37 <quattro_> How are you doing historical data for example hourly averages then?
11:37 <jokke> you mean aggregates?
11:38 <quattro_> yeah
11:38 <jokke> by using the aggregation framework
11:38 <jokke> *pipeline
11:38 <jokke> range queries for timestamp
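A sketch of the kind of pipeline being described, with the placeholder names from the per-sample layout above: a range query on timestamp followed by a $group that averages per hour.

    db.metrics.aggregate([
      { $match: { sensor_id: 'cpu.usage',
                  timestamp: { $gte: ISODate('2017-03-01T00:00:00Z'),
                               $lt:  ISODate('2017-03-02T00:00:00Z') } } },
      { $group: { _id: { day:  { $dayOfYear: '$timestamp' },
                         hour: { $hour: '$timestamp' } },
                  avg_value: { $avg: '$value' } } }
    ])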
11:40 <quattro_> what kind of performance do you get for longer time ranges, for example 6 months of data?
11:40 <jokke> raw data?
11:41 <quattro_> yeah
11:41 <jokke> haven't tried that long ranges
11:41 <quattro_> how about a month? <100ms?
11:41 <jokke> but why would you query 6 months of raw data? :D
11:42 <jokke> also you can't directly compare our results, because we store a lot of data in the documents
11:43 <jokke> here's an example: https://p.jreinert.com/U0lS/
11:45 <quattro_> You would just store all metrics in a single (for example per-minute) document and then use projection to get the metric you want?
11:46 <jokke> yes. the interval is interchangeable
11:46 <jokke> we have loggers that send data every 2s
11:47 <jokke> so that bunch is stored in a single doc every 2s
11:47 <jokke> aggregated docs look just the same
11:47 <jokke> that's a huge advantage when you write your application
11:48 <jokke> you can treat it exactly the same as raw data
11:48 <quattro_> I always thought it was best to process the data in your application instead of using mongodb’s aggregation framework
11:49 <jokke> ?
11:49 <jokke> why?
11:49 <jokke> i mean.. the aggregation framework isn't pretty, i have to admit
11:49 <jokke> but it gets the job done, and fast, and using concurrency on shards.
11:50 <quattro_> Scaling application servers is much easier than mongodb tho
11:50 <jokke> we have an aggregator running which aggregates docs from raw to 1 min, from 1 min to 15 min, from 15 min to 1h and from 1h to a day
11:50 <jokke> how come?
11:50 <jokke> adding a shard to an already sharded setup is a piece of cake
11:51 <jokke> the aggregator stores the docs in separate collections
11:51 <jokke> in the application we have a simple algo which picks the most suitable aggregation level for a query
11:52 <jokke> quattro_: and still your mongodb could become the bottleneck while you're scaling app servers
11:53 <quattro_> mongodb is fast, even for very long data ranges (2 years), I’m just worried about how the updates are being done
11:53 <jokke> the _sharding_hash field you see in the example doc there is used for evenly distributing docs across shards
11:55 <jokke> i don't know that much about the internals of wiredtiger. i'd suspect your io spikes come from processing the journal though (and only then writing the docs to the actual db)
11:55 <jokke> but that's a wild guess
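The _sharding_hash jokke mentions is an application-generated field; MongoDB can also distribute documents evenly by hashing a shard key natively, roughly like this (database, collection and key names are placeholders):

    sh.enableSharding('mydb')
    sh.shardCollection('mydb.metrics', { sensor_id: 'hashed' })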
11:59 <quattro_> for now instead of upserting into hourly documents I’ll try inserting into minute documents with the same structure and see what it’s like
12:05 <jokke> okay
12:05 ssarah joined
12:21 StephenLynx joined
12:23 <quattro_> jokke: initial impression seems like disk writes are somewhat stabilizing :) I’ll see in a couple hours if fetching graphs is slower
12:29 RickDeckard joined
12:30 <jokke> assuming your indexes are in order it should be fast
12:39 Mr joined
12:42 Capkirk joined
12:43 <Capkirk> will all changes to multiple databases in a replica cluster be written to oplog if its enabled?
12:52 compeman_ joined
13:00 geoffb joined
13:00 dantti joined
13:01 ramortegui joined
13:01 dantti left
13:02 <Capkirk> will all changes to multiple databases in a replica cluster be written to oplog if its enabled?
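On Capkirk's question: the oplog belongs to the replica set as a whole, so writes to every database on the primary end up in the single local.oplog.rs collection. A quick way to check from the shell (the database name in the regex is a placeholder):

    // most recent oplog entries touching one database
    db.getSiblingDB('local').oplog.rs.find({ ns: /^mydb\./ }).sort({ $natural: -1 }).limit(5)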
13:22 nanohest_ joined
13:25 sfa joined
13:30 re1 joined
13:30 jeffreylevesque joined
13:31 nanohest joined
13:52 lessthan_jake joined
13:53 iceden joined
13:54 nanohest joined
13:56 YoY joined
13:57 itaipu joined
13:57 nanohest joined
14:01 Folkol joined
14:07 yhf_ joined
14:19 lessthan_jake joined
14:24 Folkol joined
14:34 lessthan_jake joined
14:39 nanohest joined
14:40 compeman_ joined
14:41 nanohest joined
14:42 freeport joined
14:51 silenced joined
14:53 lessthan_jake joined
14:55 artok joined
14:56 philipballew joined
15:11 artok joined
15:13 compeman_ joined
15:16 re1 joined
15:17 synchroack joined
15:17 re1 joined
15:18 re1 joined
15:20 re1 joined
15:21 re1 joined
15:21 jamesaxl joined
15:23 synchroack joined
15:37 re1 joined
16:03 orbyt_ joined
16:12 lessthan_jake joined
16:24 artok joined
16:31 svm_invictvs joined
16:33 synchroack joined
16:35 raspado joined
16:36 leo_ joined
16:37 <leo_> Hey everyone, I'm interning for a company and I've been tasked with designing something similar to "airtable.com". My question is: would this be a good use of mongodb?
16:38 <leo_> The basic concept is allowing users to create their own "table" of which they can specify what to put in there
16:38 <artok> that seems like trello kind of kanban board
16:38 raspado_ joined
16:38 <leo_> It's like a spreadsheet
16:39 <leo_> artok: Never heard of that before
16:39 <artok> not that kind of spreadsheet
16:42 <artok> but yes, I'd do those using mongodb
16:42 <leo_> artok: lol, I'm thinking the same. I just have to convince my boss I guess
16:43 <artok> and he tries to build that on top of sql?
16:43 <leo_> Yeah. He wants me to use postgres and django
16:43 <leo_> I have a system built that auto generates new database tables to a user specs
16:44 <leo_> But, the difficulty is actually using those tables in a way where I don't have to barf out raw sql in django all the time
16:44 <leo_> I may be mistaken, but I think thats something that would grow out of hand really quickly. Maintaining it would become a headache I think
16:44 <artok> well that would be yes
16:44 <artok> that's why: mongo
16:46 <leo_> Yeah. I'll see what he says.
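One way the flexible-schema argument plays out in practice, as a rough sketch with made-up collection and field names: the user-defined table definition and its rows are both just documents, so nothing like runtime DDL is needed.

    // one document describing the user-defined "table"
    db.user_tables.insert({
      owner: 'leo',
      name: 'inventory',
      columns: [ { name: 'item', type: 'string' },
                 { name: 'qty',  type: 'number' } ]
    })
    // each row is a document whose fields follow that definition
    db.user_rows.insert({
      table: 'inventory',
      owner: 'leo',
      data: { item: 'widget', qty: 12 }
    })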
16:50 compeman_ joined
16:51 Soopaman joined
16:52 silenced joined
16:55 cihangir joined
17:03 InfoTest joined
17:05 synchroack_ joined
17:06 synchroack joined
17:08 compeman_ joined
17:09 compema__ joined
17:09 pzp joined
17:10 Necromantic joined
17:19 point joined
17:33 gitgud joined
17:45 kexmex joined
17:52 Soopaman joined
18:00 jamesaxl joined
18:10 devster31 joined
18:11 nanohest joined
18:13 Sasazuka joined
18:20 artok joined
18:27 artok joined
18:36 radhe joined
18:39 Soopaman joined
18:42 <radhe> Hey
18:42 <radhe> I have a mongo cluster of 3 nodes. I've initialised the replicaSet with DNS names. Now I want to connect to this cluster with the official Node.js client through a CNAME for that DNS. I cannot connect. Any words?
18:43 <radhe> In simple terms: how do I work with Node.js clients when using CNAMEs for a MongoDB cluster?
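On radhe's question: the Node.js driver only uses the connection string as a seed list; during discovery it switches to the host names stored in the replica set configuration (rs.conf()), so a CNAME that is not in that configuration gets replaced by the canonical names, and those must be resolvable from the client. A connection sketch with placeholder host names, assuming the 2.x driver's callback signature:

    var MongoClient = require('mongodb').MongoClient;

    // seed list may use CNAMEs, but the driver reconnects to the
    // host names listed in rs.conf() after discovery
    var url = 'mongodb://node1.example.com:27017,node2.example.com:27017,' +
              'node3.example.com:27017/mydb?replicaSet=rs0';

    MongoClient.connect(url, function (err, db) {
      if (err) throw err;
      db.collection('test').findOne({}, function (err, doc) {
        console.log(doc);
        db.close();
      });
    });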
18:47 gentunian joined
18:54 radhe joined
19:10 ironpig joined
19:16 synchroack joined
19:21 <ironpig> is there any advantage in setting single-field indexes on each field of a two-part $and query, compared to unindexed fields: find({$and:[{'value1':'X'},{'value2':'Y'}]}), such that single-field indexes exist for value1 & value2, but not a combined index?
19:23 rendar joined
19:23 rendar joined
19:24 Lujeni joined
19:27 chris| joined
19:39 <gitgud> seems no one is around
19:40 <gitgud> ironpig, ok hello are you there?
19:40 <gitgud> i'll bite
19:40 artok joined
19:42 Silenced_v2 joined
19:45 artok joined
19:46 okapi joined
19:56 nanohest joined
20:00 InfoTest joined
20:00 <ironpig> gitgud, yes I'm here.
20:01 <gitgud> ironpig, ok so i did some tests to replicate your question
20:01 <gitgud> and then hit explain()
20:02 <ironpig> ok
20:02 <gitgud> i had 2 fields, "firstName" and "lastName", and made a compound index on both, with firstName before lastName in the index. then i searched through them and found that when you do a firstName search or a firstName + lastName search the index is being used, as advertised
20:03 gentunian joined
20:03 <gitgud> so i delete that index and add 2 individual single index
20:03 <ironpig> ok
20:03 <gitgud> one on firstName and one on the lastName. and then i did a compound search of firstName + lastName, and found out that the lastName index is the only one being used
20:04 <ironpig> huh, interesting...
20:04 <gitgud> when you do individual search, the individual indexes are always being used though. as you would expect
20:04 blizzow joined
20:05 <gitgud> pastebin.com/4PwSdCy9
20:05 <gitgud> as you can see the stage is IXSCAN and the only index being used is lastName, despite there being a firstName index in there
20:06 <ironpig> so, maybe I'm wrong but... I've been thinking that if you have a compound search with values, the query would use both indexes, like it uses two indexes. I'm assuming a compound index will always be faster, but I'm wondering if using two indexes happens inside an $and query.
20:07 <gitgud> ironpig, if you have a compound index on value1+value2 and then you do an $and search on value1 and value2, then the compound index will be used
20:07 <gitgud> all comma separated searches are actually translated into $and anyway
20:07 realisation joined
20:08 <gitgud> now if you are having single indexes on *each* field, and then you do a compound search, only one of them will be used, as you can see from my experiment
20:08 <gitgud> pastebin.com/DeaCqz6N
20:08 <ironpig> thank you : )
20:09 <gitgud> ironpig, look at that link
20:09 <gitgud> as you can see it detects the presence of the other index, look in the rejectedPlans section
20:09 <gitgud> but it still uses 1 index in the winningPlans section
20:09 <gitgud> so its still at least using an index :)
20:09 <gitgud> anyway good luck
20:10 <ironpig> :)
20:10 <ironpig> this has been very helpful!
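A condensed version of gitgud's experiment, for reference (collection and field names as in the discussion): with two single-field indexes, explain() reports an IXSCAN on only one of them for the two-field query, while a compound index covers the whole predicate.

    db.people.createIndex({ firstName: 1 })
    db.people.createIndex({ lastName: 1 })
    // winning plan uses a single IXSCAN; the other index shows up in rejectedPlans
    db.people.find(
      { firstName: 'Ada', lastName: 'Lovelace' }
    ).explain('queryPlanner')

    // a compound index serves both fields of the predicate
    db.people.createIndex({ firstName: 1, lastName: 1 })
    db.people.find(
      { firstName: 'Ada', lastName: 'Lovelace' }
    ).explain('queryPlanner')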
20:10 philipballew joined
20:20 artok joined
20:24 Sasazuka_ joined
20:30 jwd joined
21:05 Sasazuka_ joined
21:12 synchroack joined
21:25 felixjet joined
21:28 coudenysj joined
21:29 culthero joined
21:29 klics joined
21:31 compeman_ joined
21:38 jeffreylevesque joined
21:52 Soopaman joined
22:00 Siegfried joined
22:24 StephenLynx joined
22:37 realisation joined
22:45 synchroack joined
23:08 synchroack joined
23:18 synchroack joined
23:23 point_ joined
23:29 Sasazuka joined
23:37 synchroack joined
23:53 synchroack joined