Hacker News | gpsarakis's comments

Sphinx does indeed do a very good job and is continuously improved. It is very fast and has a small memory footprint.

We also built https://techusearch.com which uses Sphinx indexes (think of a SaaS version of Elasticsearch backed by Sphinx).


Interestingly enough, there is one unofficial API, http://pygments.appspot.com/, but it supports an outdated version (1.5).


Great work! Are you also considering PyPy support?


I've considered it, yes --- but it's hard. Currently I segfault under PyPy. I've got the learner and hash-table code working, but I need to debug the NLP. I suspect it's the way I'm interning my strings.


pigz does parallel compression, taking advantage of multiple cores. I am not really sure if you can achieve this with scp.
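To illustrate the general idea of parallel compression, here is a minimal Python sketch. It is not how pigz is implemented: it assumes we simply split the input into chunks, compress each chunk as its own gzip member in a thread pool, and concatenate the members (which is still a valid gzip stream). The function name `parallel_gzip` and its parameters are made up for this example.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> bytes:
    """Compress `data` chunk by chunk in a thread pool (illustrative, not pigz)."""
    # Split the input into fixed-size chunks; keep one empty chunk for empty input
    # so the result is still a valid gzip stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    # Each chunk becomes an independent gzip member. pool.map preserves input
    # order, so the members can be joined back in sequence.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = list(pool.map(gzip.compress, chunks))
    # Concatenated gzip members decompress to the concatenated original data.
    return b"".join(members)


payload = b"hacker news " * 100_000
compressed = parallel_gzip(payload, chunk_size=64 * 1024)
restored = gzip.decompress(compressed)
```

Compressing chunks independently usually costs a little compression ratio compared with a single stream, and more workers means more CPU used at once.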


You can use pigz with SSH as you can pipe commands over SSH (google it). If nc is faster than scp, I guess encryption is a factor, but then they're not comparable solutions to the same problem.
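A rough sketch of what such a pipeline could look like, written as a small Python helper that only builds the shell command string (it does not run anything). The host, paths, and the helper name `build_pigz_ssh_pipeline` are hypothetical; the flags used (`tar -cf -`, `tar -xf -`, `pigz -p`, `pigz -d`) are standard, but check them against your installed versions.

```python
import shlex


def build_pigz_ssh_pipeline(src_dir: str, host: str, dest_dir: str, threads: int = 4) -> str:
    """Build a `tar | pigz | ssh` pipeline string (hypothetical helper)."""
    # Command executed on the remote side: decompress, then unpack into dest_dir.
    remote = f"pigz -d | tar -xf - -C {shlex.quote(dest_dir)}"
    return (
        # Pack the source directory to stdout.
        f"tar -cf - -C {shlex.quote(src_dir)} . "
        # Compress locally with `threads` parallel workers.
        f"| pigz -p {int(threads)} "
        # Stream the compressed data over SSH to the remote command.
        f"| ssh {shlex.quote(host)} {shlex.quote(remote)}"
    )


command = build_pigz_ssh_pipeline("/data/logs", "user@backup", "/srv/in")
```

The resulting string can be pasted into a shell. The transfer stays encrypted by SSH, and only the compression step is parallelized.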


Sure you can :). Naturally transfers via SSH "suffer" from the encryption overhead but prevent MITM/network sniffing etc. The author points out that (only) in a trusted LAN you could use this solution to make things go faster.

I guess the title should be a little milder: scp isn't going away, nor is rsync via SSH for that matter.


Thumbs up for pv, which shows the progress bar. Using parallel compression may have a serious impact on CPU resources, so it needs to be balanced.

Have you also done any testing with ssh + gzip?

Also, as you note at the end, the security concerns are not trivial.


Great read! Here is an attempt at a DSL using PLY (a Python Lex-Yacc implementation): https://github.com/georgepsarakis/quick-bash

It is actually a transcompilation to Bash from a functional language, using Python for the intermediate processing.


You probably can, if you implement upload callbacks that are passed through the success_action_redirect parameter. I'm not sure, though.


It says that it only stores metadata locally, not the objects themselves (although it could probably have an option for that too and serve as a general S3 cache).

Nice effort. Redis is perfectly fine, but I believe the storage layer should be more clearly separated in case someone wants another type of storage; for example, in-memory SQLite is adequate and already installed on most systems.
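As a sketch of what a separated storage layer could look like, here is a tiny key/value interface with an in-memory SQLite backend. This is not taken from the project; the names `MetadataStore` and `SQLiteStore`, the table layout, and the example keys are all assumptions made for illustration.

```python
import sqlite3
from typing import Optional, Protocol


class MetadataStore(Protocol):
    """Hypothetical interface any backend (Redis, SQLite, ...) would satisfy."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class SQLiteStore:
    """SQLite-backed store; ':memory:' keeps everything in RAM."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS metadata ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, key: str, value: str) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO metadata (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT value FROM metadata WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


store: MetadataStore = SQLiteStore()
store.put("bucket/a.txt", "etag-1")
store.put("bucket/a.txt", "etag-2")  # overwrites the previous value
```

Code written against the interface would not need to change if the backend were swapped for a Redis-based class with the same two methods.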


I agree with you about a replacement for redis. Maybe it's only me, but I seem to be relying too much on redis for simple things. I feel that soon enough I'll need 4 independent redis servers for my production environment.

That's of course tongue-in-cheek, but there's a degree of seriousness.


Good idea. Maybe in the future it will support different backend databases. For now, Redis is perfect.


Taking a quick look at the source code directory structure, https://github.com/jquery/jquery/tree/master/src, I believe that you could create a minimal, customized build containing only the features that you actually want. For example, only element selection (which is very convenient and fast IMO) and AJAX.


CSS selectors are much easier to remember than XPath. Python's BeautifulSoup allows you to select elements with selectors and is very convenient. XPath is a bit more verbose, and most people are already familiar with CSS syntax.


And indeed, any CSS selector can be converted to an equivalent XPath query, at least for selectors on XML and HTML. http://pythonhosted.org/cssselect/ is a Python implementation of such a conversion. (Note that there is no XPath to CSS selector converter, as XPath can express certain things CSS selectors cannot, as CSS selectors are designed such that they can be matched using a streaming parser as soon as the first child of the element appears.)
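To make the conversion idea concrete, here is a toy translator written from scratch. It is not cssselect's implementation or API: it only handles tag names, `#id`, `.class`, and the descendant combinator (a space), and the function name `simple_css_to_xpath` is made up for this sketch.

```python
import re

# One token is an optional '#' or '.' prefix followed by a name.
_TOKEN = re.compile(r"([#.]?)([\w-]+)")


def simple_css_to_xpath(selector: str) -> str:
    """Translate a very small subset of CSS selectors to XPath (toy example)."""
    steps = []
    # Whitespace separates compound selectors joined by the descendant combinator.
    for compound in selector.split():
        tag, predicates = "*", []
        for prefix, name in _TOKEN.findall(compound):
            if prefix == "#":
                predicates.append(f"[@id='{name}']")
            elif prefix == ".":
                # Usual idiom for matching one word inside a space-separated class list.
                predicates.append(
                    "[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]" % name
                )
            else:
                tag = name
        steps.append(tag + "".join(predicates))
    # '//' between steps expresses "any descendant".
    return "//" + "//".join(steps)


xpath = simple_css_to_xpath("div#main p.note")
```

Even this small case shows how much longer the XPath form gets, mostly because of the class-matching predicate.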


I'd say more than a bit. When you have multiple namespaces in your xml it can become so verbose that it's hard to see the signal through the noise. But then, maybe there's a way to reduce that noise in a way that I don't understand.

Long comment short, I agree. CSS selectors are easier to understand and read.
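On reducing the namespace noise: one option, assuming Python's standard-library ElementTree is acceptable for the job, is to pass a prefix-to-URI mapping so that queries use short prefixes instead of full namespace URIs. The sample feed and the prefixes `a` and `dc` are invented for this sketch.

```python
import xml.etree.ElementTree as ET

FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <entry><title>First</title><dc:creator>alice</dc:creator></entry>
  <entry><title>Second</title><dc:creator>bob</dc:creator></entry>
</feed>"""

root = ET.fromstring(FEED)

# Verbose form: every step carries the full namespace URI in braces.
verbose_titles = [
    e.text
    for e in root.findall(
        "{http://www.w3.org/2005/Atom}entry/{http://www.w3.org/2005/Atom}title"
    )
]

# Shorter form: declare the prefixes once and reuse them in each query.
NS = {
    "a": "http://www.w3.org/2005/Atom",
    "dc": "http://purl.org/dc/elements/1.1/",
}
titles = [e.text for e in root.findall("a:entry/a:title", NS)]
creators = [e.text for e in root.findall("a:entry/dc:creator", NS)]
```

The prefixes in the mapping are local to the query and do not have to match the ones used in the document itself.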

