00:00 GitHub99 joined
00:00 <GitHub99> [sequel] jeremyevans pushed 2 new commits to master: https://git.io/v9AZ0
00:00 <GitHub99> sequel/master 5280d40 Steven Cregan: Handling for Oracle 11g xe schema[:db_type]...
00:00 <GitHub99> sequel/master 30c7abd Jeremy Evans: Fix schema_dump handling of not null in database type on Oracle (Fixes #1351)
00:00 GitHub99 left
00:00 GitHub174 joined
00:00 <GitHub174> [sequel] jeremyevans closed pull request #1351: Bugfix: Handling for Oracle 11g xe schema[:db_type] (master...master) https://git.io/v96RL
00:00 GitHub174 left
00:17 GitHub89 joined
00:17 <GitHub89> [sequel] jeremyevans pushed 1 new commit to master: https://git.io/v9Ana
00:17 <GitHub89> sequel/master 7a0656c Jeremy Evans: Make Database#with_server in the server_block extension accept a second argument for a different read_only shard (Fixes #1355)...
00:17 GitHub89 left
00:18 glennpratt joined
00:27 glennpratt joined
08:56 aidalgol joined
11:32 GitHub0 joined
11:32 <GitHub0> [sequel] StevenCregan commented on issue #1351: My sincere apologies, I really don't know Ruby and didn't quite know what I was doing. I reverted simply so I could use my temporary solution; I should have closed this instead of creating clutter.... https://git.io/v9xLT
11:32 GitHub0 left
11:50 segfaulty joined
11:54 <segfaulty> while going through some of the source to monkeypatch a few things I need, I noticed there is quite a lot of room for optimization; wish I had time to contribute to this project, maybe I will one day
11:56 <segfaulty> I go to extreme lengths to optimize some of my libraries and even third party gems occasionally, using techniques like precompiling shortcuts with code generation, but there is quite a bit of low hanging fruit that I noticed in sequel which can be optimized without destroying the quality of the code base
11:58 <segfaulty> things like options being stored in hashes instead of instance variables, which introduces lots of lookup overhead, and blocks being captured just to be passed on to another method
11:59 <segfaulty> `def foo; bar { yield } end` is a lot faster than `def foo(&block); bar(&block) end`
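A minimal sketch of the comparison being made here, assuming the benchmark-ips gem is available; on MRI the yield form avoids materializing the block into a Proc object:

    require 'benchmark/ips'

    def helper
      yield
    end

    # Forward the block without ever reifying it into a Proc.
    def call_with_yield
      helper { yield }
    end

    # Capture the block as a Proc, then splat it back out.
    def call_with_capture(&block)
      helper(&block)
    end

    Benchmark.ips do |x|
      x.report("yield")  { call_with_yield   { 1 + 1 } }
      x.report("&block") { call_with_capture { 1 + 1 } }
      x.compare!
    end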
12:07 <Caesium> https://www.reddit.com/r/ruby/comments/2x9axs/why_blocks_make_ruby_methods_439_slower/
12:07 <Caesium> doesn't seem worth it
12:07 <Caesium> 439% sounds like a big number at first, but apparently they're working on "fixing" it, and in absolute terms it makes such a trivial difference to run times
12:08 <Caesium> are there really lots of places where Sequel does it that don't use the additional flexibility gained by converting the block to a Proc?
12:08 <Caesium> some claims that MRI 2.3 fixed it, but that doesn't seem to be the case in my quick benchmark on 2.3.1
12:09 <Caesium> be interested in jeremyevans's view on this one :)
12:12 <segfaulty> in my experience, the overhead of capturing blocks can make a big difference if the method is called often enough, which things like Sequel.synchronize probably are
12:13 <segfaulty> I haven't looked at a lot of the code base, but basically all the cases I came across were capturing blocks unnecessarily
12:13 <segfaulty> the only time you need to capture a block is if it needs to be stored in a variable
12:14 <segfaulty> ruby will never be able to optimize captured blocks more, because they need to track a lot of state and allocate an option for it
12:14 <segfaulty> *much more
12:14 <segfaulty> *allocate an object
12:22 <segfaulty> I do a lot of work on high performance applications in far faster languages than Ruby, so I am painfully aware of how slow Ruby actually is, and do a lot of profiling to be able to implement things with as little unnecessary overhead as possible
12:23 <segfaulty> so when I look at the source of sequel, or literally any other gem, my OCD goes through the roof
12:23 <segfaulty> I go as far as writing C extensions to optimize my own core libraries, but there is a crazy amount of unnecessary overhead in almost all gems out there
12:24 <segfaulty> I end up monkeypatching almost every gem at some point to optimize some code path
12:36 GitHub86 joined
12:36 <GitHub86> [13sequel] 15jodosha opened issue #1361: URI encode PG connection string before to URI.parse 02https://git.io/v9xZy
12:36 GitHub86 left
12:39 <segfaulty> there are probably a lot of places where named parameters could be used instead of option hashes, which would save tons of slow hash table lookups. there are also things like `h.merge!(key: value)` being used where `h[:key] = value` would do, which would save an unnecessary hash allocation and multiple lookups
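A minimal sketch of the two calling styles being contrasted; the method and parameter names here are hypothetical, not Sequel's API, and the keyword form requires Ruby 2.0+:

    # Option-hash style: the caller allocates a hash and every access is a lookup.
    def fetch_with_opts(sql, opts = {})
      server = opts[:server] || :default
      [sql, server]
    end

    # Keyword-argument style: no hash lookup in the common case.
    def fetch_with_kwargs(sql, server: :default)
      [sql, server]
    end

    fetch_with_opts("SELECT 1", server: :read_only)
    fetch_with_kwargs("SELECT 1", server: :read_only)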
12:43 <Freaky> it's easy to pick nits like this, but they're not very compelling unless they're shown to actually measurably affect performance :)
12:45 <segfaulty> in a language that is as slow as ruby, it adds up surprisingly quickly
12:45 <segfaulty> I recently profiled the Beefcake gem, monkeypatched it with a bunch of optimizations and managed to speed it up by 350%
12:46 <segfaulty> both encoding and decoding
12:55 <Caesium> did they get accepted upstream or was it a bit gnarly?
12:56 <Caesium> there's always a line to draw between speed and readability, especially in Ruby.. if you want speed at all costs maybe you're in the wrong language ;)
12:57 <segfaulty> it was just monkeypatched for a client, who unfortunately isn't going to open source it, I'll probably create a proper patch for upstream one day when I have time
12:58 <segfaulty> sadly I don't have the time to contribute to open source projects, as much as I'd like to
12:58 <segfaulty> like I said earlier, I go to extreme lengths to optimize libraries in many cases, using code rewriting and even C extensions
12:59 <segfaulty> but there is always a lot of room for optimization in plain ruby code, without destroying the quality of the code base, or increasing complexity too much
13:02 <Freaky> what's Sequel 5 targeting, 1.9+? That would mean keyword arguments are still verboten
13:03 <* Freaky> would hope for a bit more aggression
13:05 <Freaky> 100 million hash key sets, JRuby: merge! 24s, []= 12s; MRI 2.4: merge! 60s, []= 14s
13:06 <Freaky> seems like the sort of optimization truffle could do in its sleep
13:07 <segfaulty> it feels like just yesterday, but Ruby 2.0 was released in Feb 2013 and has been obsolete and unsupported for over a year already
13:07 <segfaulty> so I don't think it would be a stretch to start targeting it
13:08 <segfaulty> Freaky: did you profile that with creating a new hash every time, or using merge!(existing_hash)
13:08 <Freaky> existing hash
13:08 <segfaulty> that's not accurate then, since you are skipping the allocation
13:09 <Freaky> then I'm just benchmarking object allocation
13:09 <Freaky> switching from merge! to []= isn't going to change how the target hash is allocated?
13:10 <segfaulty> `x.merge!(key: value)` creates a new hash and does a lookup to set `:key` in it, then passes that hash to #merge!, which loops through each pair, looks up `:key` again, and finally does `x[:key] = value`
13:10 <segfaulty> so you need to profile the allocation as well as the extra lookups
13:10 <Freaky> why?
13:10 <Freaky> if x.merge! is creating a new hash, I *am* benchmarking that
13:11 <Freaky> moving h = {..} inside the loop isn't going to help if I'm interested in how []= and merge! compare
13:11 <Caesium> doesn't he mean the hash you're passing to merge!
13:11 <segfaulty> ^
13:12 <Freaky> I assumed you were talking about a literal
13:13 <Freaky> n.times { h.merge!(bar: v) }
13:14 <Freaky> vs n.times { h[:bar] = v }
13:15 <Caesium> yeah, doesn't that allocate a hash of {bar: v} n times?
13:15 <segfaulty> that's correct, I asked if you were using `merge!(existing_hash)` which could have skipped the extra hash allocation
13:16 <segfaulty> it allocates the new hash and does a lookup to set :bar, then merge! has to do another lookup to pull :bar out again
13:16 <segfaulty> which is a lot more work than just []=
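A minimal sketch of the micro-benchmark being discussed, using only the standard library; the temporary single-entry hash is what `h.merge!(bar: v)` pays for on every call:

    require 'benchmark'

    n = 10_000_000
    h = {}
    v = 1

    Benchmark.bm(8) do |x|
      # Allocates a one-entry hash per iteration, then copies it into h.
      x.report("merge!") { n.times { h.merge!(bar: v) } }
      # Writes the key directly, no temporary hash.
      x.report("[]=")    { n.times { h[:bar] = v } }
    end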
13:18 <Freaky> right
13:18 <Freaky> that bit of the line is all the way over ----------------------------------------> here and I missed it ;)
13:19 <segfaulty> haha
13:24 <Freaky> hm, I don't see too many places where that's actually used that I would expect to make a meaningful difference
13:25 GitHub122 joined
13:25 <GitHub122> [sequel] janko-m commented on issue #1361: I'm not sure whether Ruby has a non-deprecated method for escaping strings, as `URI.encode`/`URI.escape` are deprecated (declared as "obsolete", but I didn't find any alternative in the [original discussion](http://ruby-core.ruby-lang.narkive.com/0XKxdn7l/ruby-core-29293-uri-un-escape-deprecated)). I was researching that in https://github.com/janko-m/shrine/issues/132, and I've come to the conclusion t
13:25 GitHub122 left
13:36 <segfaulty> that's quite possible, as I said I haven't looked at much of the code base, only random parts that I needed
13:46 <Freaky> e.g "create_table(name, options.merge!(:if_not_exists=>true))" - the extra hash is probably the least of your concerns :)
14:07 GitHub140 joined
14:07 <GitHub140> [sequel] jeremyevans closed issue #1361: URI encode PG connection string before to URI.parse https://git.io/v9xZy
14:07 GitHub140 left
14:07 GitHub75 joined
14:07 <GitHub75> [sequel] jeremyevans commented on issue #1361: This is a bug in the connection string, not in Sequel. The connection string must already be a valid URL. Sequel unescapes URL encoded options, so the connection string should be: `postgres://sequeltest:p%40ssword@localhost/bookshelf`. https://git.io/v9xzb
14:07 GitHub75 left
14:13 <jeremyevans> segfaulty: If you can submit noninvasive optimization patches with benchmarks on both MRI and JRuby that shows performance faster on both, they will definitely be considered
14:16 <jeremyevans> segfaulty: I think you'll find that in terms of the hot paths that actually affect overall performance, Sequel is already well optimized
14:16 <segfaulty> realistically, I probably won't have time to do that anytime soon
14:17 <jeremyevans> segfaulty: That's fine, I'm a patient man :)
14:17 <segfaulty> I just mentioned a few things I noticed while I was looking at it, like unnecessarily capturing blocks, so that anyone in here who may work on it is aware of it
14:17 <segfaulty> it's real easy for anyone to remove &block from method signatures and add some yields to save all that overhead
14:18 <Bish> sorry to interrupt: CREATE UNIQUE INDEX email_unique_idx on users (LOWER(email));
14:18 <Bish> can i do this in sequel?
15:20 <jeremyevans> segfaulty: did you know that's actually slower on JRuby?
15:21 <jeremyevans> segfaulty: And if you care about performance, you should probably be running JRuby anyway since it's significantly faster than MRI
15:21 <jeremyevans> Bish: create_index Sequel.function(:lower, :email), :unique=>true, :name=>:email_unique_idx
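A sketch of one way this could be wired up, assuming Database#add_index on the PostgreSQL adapter accepts a Sequel.function expression in place of a column name (if not, the raw-SQL fallback in the comment does the same thing):

    # Functional unique index on lower(email), as in the question above.
    DB.add_index :users, Sequel.function(:lower, :email),
                 unique: true, name: :email_unique_idx

    # Raw-SQL fallback with the exact statement from the question:
    # DB.run "CREATE UNIQUE INDEX email_unique_idx ON users (LOWER(email))"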
15:22 <segfaulty> that's really surprising to hear since it doesn't need to allocate an object
15:23 <segfaulty> JRuby doesn't support C extensions, which is a deal breaker for me
15:23 <segfaulty> if Ruby is too slow for something, and optimizing parts for a C extension doesn't make sense, I use another language like C#
15:24 <jeremyevans> segfaulty: In some cases such as Database#synchronize, Sequel already defines separate methods on MRI and !MRI precisely to avoid proc conversion
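An illustrative sketch of that pattern (this is not Sequel's actual source, just the general shape): define the hot method differently per Ruby engine so MRI never converts the block to a Proc:

    class ConnectionHolder
      def initialize
        @mutex = Mutex.new
      end

      if RUBY_ENGINE == 'ruby'
        # MRI: yield directly so the block is never reified into a Proc.
        def hold
          @mutex.lock
          begin
            yield
          ensure
            @mutex.unlock
          end
        end
      else
        # Other engines: capturing and forwarding the block is the better trade-off.
        def hold(&block)
          @mutex.synchronize(&block)
        end
      end
    end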
15:24 <segfaulty> that's good to know
15:24 <segfaulty> I looked at Sequel.synchronize I think
15:25 <jeremyevans> segfaulty: It's certainly possible that future MRI versions may be able to use static analysis to avoid proc conversion in the cases where the block is just yielded to later
15:25 <segfaulty> yeah, there is a lot of cool optimization coming to MRI in the future
15:25 <segfaulty> things will start to get really interesting once there is a JIT
15:26 <segfaulty> but the work done on deoptimization alone is very promising
15:26 <jeremyevans> segfaulty: Sequel.synchronize is not used as much as Database#synchronize, but the same optimization could be applied there. However, I'm not sure it would be measurable given the number of times it is called.
15:26 <segfaulty> not a real problem then
15:26 <segfaulty> from the bits that I saw, I think most hot path overhead is in hash lookups
15:27 <jeremyevans> segfaulty: That's probably not avoidable in the dataset case, as part of the design is that you can use arbitrary options
15:28 <jeremyevans> segfaulty: Most of the actual hot path when executing queries is in Dataset#fetch_rows in each adapter, which is heavily optimized on most adapters
15:30 tercenya joined
15:31 <segfaulty> you may be able to use ivars for common options in certain places, but some may not provide big enough gains to be worth it without doing code rewriting
15:33 <jeremyevans> segfaulty: In general storing dataset options in a hash doesn't significantly affect performance, except during SQL generation. And dataset SQL is generally cached after it is generated
15:37 <Bish> jeremyevans: thanks a lot as usual
15:43 <segfaulty> that makes sense, the option I was thinking about specifically was :server, since I noticed that being passed and looked up all over the place
15:46 GitHub173 joined
15:46 <GitHub173> [sequel] bbozo opened issue #1362: Postgres issue with notice_receiver callback, doesn't execute with "RAISE NOTICE" (but it does with "RAISE WARNING") https://git.io/v9xHD
15:46 GitHub173 left
15:52 GitHub153 joined
15:52 <GitHub153> [sequel] bbozo commented on issue #1362: Solved ^_^ the constructor missed the `client_min_messages` parameter ^_^ https://git.io/v9xQo
15:52 GitHub153 left
15:52 GitHub192 joined
15:52 <GitHub192> [sequel] bbozo closed issue #1362: Postgres issue with notice_receiver callback, doesn't execute with "RAISE NOTICE" (but it does with "RAISE WARNING") https://git.io/v9xHD
15:52 GitHub192 left
16:00 <jeremyevans> segfaulty: Chances are it doesn't affect performance much, since it's only a few lookups before the rows are retrieved. The vast majority of retrieval time is the per-column, per-row processing
16:01 <jeremyevans> segfaulty: Unless you've benchmarked and can confirm it has a significant difference in actual workloads, you shouldn't assume it matters
16:10 <segfaulty> haven't benchmarked, it's just what I noticed
16:10 <segfaulty> for batch retrievals I'm already dropping down to the database driver anyway
16:10 <segfaulty> DB.synchronize {|conn| conn.exec("...").values }
16:11 <Caesium> is Sequel actually buying you anything, then?
16:11 <Caesium> if you're resorting to that.. why use it? :)
16:12 <segfaulty> it's just for batch retrievals, it's handling the connection pool at least
16:12 <segfaulty> I'm using models for other things
16:13 <mmun> i ran into an issue chaining two dataset methods that each join the same table.
16:52 <jeremyevans> mmun: That should be fine, as long as you use different aliases in each join
16:55 <mmun> yeah
16:57 <mmun> i feel like i'm missing something, because it feels like that way leads to defensively using unique aliases in all your dataset methods to avoid possible collisions
16:57 <mmun> probably too hard to talk about it without concrete examples :P
16:58 <jeremyevans> mmun: If you are joining to another table, you should probably know what alias you are using. That's as true in Sequel as it is when using SQL directly
17:18 tercenya joined
17:54 <segfaulty> one thing I'd really like, though it would require a fair amount of effort, is to have models store their attributes in instance variables instead of a hash
17:55 <segfaulty> I frequently cache model instances in long-running persistent applications that access them frequently and write to the database less often
17:55 <segfaulty> so the overhead of hash lookups for accessing and setting attributes on the model instance is not ideal
18:07 <jeremyevans> segfaulty: that's definitely not going to happen, though you can certainly create an external plugin that does that if you want
18:08 <jeremyevans> segfaulty: I will tell you it will significantly decrease retrieval performance
18:08 <segfaulty> I may very well do so at some point
18:09 <segfaulty> which part of retrieval exactly do you foresee being an issue?
18:10 <jeremyevans> segfaulty: taking the hash provided by the dataset and assigning each key to a separate instance variable in the model instance
18:11 <jeremyevans> segfaulty: That's a rows*cols operation
18:11 <segfaulty> right, it wouldn't make sense to do that, which is why I said it would require a fair amount of effort to change to ivars
18:11 <jeremyevans> segfaulty: The Sequel API is that Dataset#each yields a symbol-keyed hash per row, that will never change
18:11 <segfaulty> data would need to be pulled out as arrays of values
18:11 <segfaulty> not hashes
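A hypothetical sketch (not Sequel code, all names invented here) of the cost jeremyevans is describing: hydrating ivar-backed models from the per-row hashes the dataset already yields adds one instance_variable_set per column per row:

    class IvarBackedModel
      def initialize(row_hash)
        row_hash.each do |column, value|
          # Extra work per column per row, on top of the hash the driver built.
          instance_variable_set(:"@#{column}", value)
        end
      end
    end

    rows = [{ id: 1, email: "a@example.com" }, { id: 2, email: "b@example.com" }]
    models = rows.map { |h| IvarBackedModel.new(h) }  # rows * cols ivar writes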
18:49 tercenya joined
20:08 tercenya joined
20:15 tercenya joined
21:47 tercenya joined
23:12 ta_ joined