Useful sed scripts and patterns (github.com/adrianscheff)
528 points by adrianscheff on Nov 12, 2021 | 121 comments



There are a ton of little errors in this post (at the time of my leaving this comment, of course): the variable delimiter example is missing the close _, the first word example is missing the open ', and the except line 5 example is missing a space; meanwhile, these snippets will cause bad habits, such as thinking about "words" as "things involving letters", using -r instead of -E, and a massive knowledge gap with respect to appropriate use of semicolons... I'm also really confused as to why sed is suddenly "s" in one of the examples. This just isn't a very good reference, and I'm not sure if the author thinks about how "c" works, but it really needs a \ afterwards, no matter what the one specific version of sed you are using might allow :/. As someone who loves sed so much he nigh unto regrets how much he uses it: "beware".

FWIW, I'd argue that the path to true sed mastery eventually goes through the hold space, which isn't mentioned here. For some real fun, and an exercise that might change how you mentally model what sed is capable of, check out SedSokoban.

https://aurelio.net/projects/sedsokoban/
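
For anyone who hasn't met the hold space yet, here is a minimal sketch (my own illustration, not taken from the linked post): the classic line-reversal one-liner, which accumulates everything seen so far in the hold space and prints it once at the last line. The three-line input is made up.

```shell
# Reverse the order of lines (a tac emulation) using the hold space.
#   1!G  on every line except the first, append the hold space to the pattern space
#   h    copy the pattern space (current line plus all earlier lines, newest first) to the hold space
#   $p   on the last line, print the accumulated result
reversed=$(printf '%s\n' one two three | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

The point is less the trick itself than the model: the pattern space is reset every cycle, while the hold space is the one place where state survives between lines.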


Thank you for the feedback and the constructive criticism. Regarding the hold space I found it very confusing to use and understand. IMO it represents the 80/20 of sed (80% effort for 20% results).

You're right, I'm using -r even when it's not necessary. In my defense, I think it's a good habit to have, since without it regular expressions are painful to write. I hadn't considered that -E is the better choice, but I'll correct that now. (One might argue that typing -r is easier than -E :D ).

Regarding the definition of words - I also thought of that when I wrote that snippet. I know it's not the complete regex for a word and that word regex patterns might differ. And I was probably a bit lazy - but I'll correct it presently.

I'd also like to say that I didn't write this as an absolute and ultimate reference. If I'm honest I wrote this as much to teach others as to solidify this knowledge myself. Now since it seems it gained traction I'm kinda obligated to make this better, no? Darn. :)

PS: if you'd like to help me make this better please submit a pull request or leave a comment here. Looking at your profile I see that you try to limit your online time so I probably shouldn't have asked. :P


Most examples that use '#!/bin/bash' should be rewritten to use '#!/usr/bin/sed -f'.

Example "multiple replacements":

    #!/usr/bin/sed -f
    s/a/A/
    s/foo/BAR/
    s/hello/HELLO/
Save it as replace.sed; chmod u+rx replace.sed; ./replace.sed myfile.txt


Thanks, I've modified to use .sed scripts instead. I used bash scripts because I remembered (incorrectly it seems) that sed scripts were unsafe.


CentOS 7's sed doesn't support -E, but CentOS 8 does. That might cause some confusion.

Also, FYI, this is the link for the POSIX specification for sed:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/s...

Notice that -i is not POSIX.
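
If -i is off the table, one common workaround is to write to a temporary file and move it over the original. A rough sketch, assuming `mktemp -d` is available (it is widespread but not itself POSIX); the file name and contents are made up for the demo:

```shell
# Edit a file "in place" without the non-POSIX -i flag:
# write to a temporary file, then replace the original only if sed succeeded.
workdir=$(mktemp -d)                       # scratch directory for this demo
printf '%s\n' 'foo one' 'two' > "$workdir/demo.txt"

sed 's/foo/bar/' "$workdir/demo.txt" > "$workdir/demo.txt.tmp" &&
  mv "$workdir/demo.txt.tmp" "$workdir/demo.txt"

result=$(cat "$workdir/demo.txt")
printf '%s\n' "$result"
rm -r "$workdir"
```

The `&&` matters: if sed fails, the original file is left untouched.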


No mention of ';' to have multiple commands on one line.

Example:

    sed '10{p;q;}' myfile.txt


Thank you for your feedback! Personally I very rarely use the ; operator since it works badly with braces and commands that use files. However I've added an example using the ; operator.

PS: May I humbly point out that the command you provided will actually print up to line 10 and then a duplicate line. Like I said, unexpected (even though it seems logical). :)


Yes: forgot -n
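
To make the difference concrete, a small sketch (using `seq` to generate 20 numbered lines, which assumes `seq` is installed):

```shell
# Without -n, sed auto-prints every line, so 'p' on line 10 prints it a second time.
with_dup=$(seq 1 20 | sed '10{p;q;}' | wc -l)   # lines 1..10 plus a duplicate of line 10

# With -n, only the explicit 'p' produces output.
only_10=$(seq 1 20 | sed -n '10{p;q;}')

# Same result without -n: delete every line except line 10, and quit there.
also_10=$(seq 1 20 | sed '10q;d')

printf '%s %s %s\n' "$with_dup" "$only_10" "$also_10"
```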


While this might not be the ultimate sed reference, it turned me onto some patterns I haven't used before. The examples aren't meant to be used directly and will involve some trial and error before they're useful, but knowing what's possible makes it a very valuable reference.



100% agree.

This document shows practices of sed, but the author doesn't seem aware of best practices. A sed script starting with '#!/bin/bash' is a bad practice.


Ha, you got me there. I remembered incorrectly that sed scripts were bad practice and that one should use bash instead. I've corrected the mistake.

What other good practice suggestions do you have? (if you're comfortable answering and have the time)


I really dislike sed, awk, and all these utilities because I just can’t remember how to use them. Every time I want to use one of these tools I have to relearn the stuff, and then after a while I forget how to use it. If I don’t use it all the time I just forget. I really don’t think non-interactive CLIs are a good way to do complicated tasks.


That’s just because you don’t use them often though. If they were a daily driver it’d be different.

I think that if you have a good mental model of how awk works, then you know times that it’s the right tool for the job. And on those occasions you’ve got plenty of time to go search for some syntax, make a coffee, maybe even go for a run all with change to spare compared to using pretty much anything else.

I will admit that if I can do it with a few greps, seds and cuts then I tend to just use those first, because there's no thinking involved for me.


> I really dislike sed, awk, and all these utilities because I just can’t remember how to use them. Every time I want to use one of these tools I have to relearn the stuff, and then after a while I forget how to use it. If I don’t use it all the time I just forget.

I'm going to let you in on a little secret: many of us who have been doing development or systems administration in some form or another for multiple decades are the same way. I have a terrible memory, but it turns out that does not have to be a big hindrance for tech work.

When I'm working (or tinkering at home), I keep a tab to my personal wiki open at all times. I will not usually bother to write something down the first time I do it. But if I have to google it twice, I throw it into my wiki. This does two things: 1) It makes me (slightly) more likely to actually remember it for next time, and 2) The next time I need the information, I know where I can find it without having to wade through SEO-encumbered blog spam or outdated StackOverbutt answers.

A database of personal notes is a powerful multiplier for technical ability and productivity. To the point that whenever I interview a candidate for a job, one of the things I ask is what system they keep their notes in. Their system doesn't matter at all to me but if the answer is none, that's a definite strike against.

>I really don’t think non-interactive CLIs are a good way to do complicated tasks.

Many of us thrive on the Unix command line because of the raw power you can wield with it. "Why waste time trying to learn arcane shell commands when you can just write a script in Python?" is a common refrain I hear from developers with only a few years under their belt. My response to that is, "Why would I waste time writing a Python script when I can do it in one line of shell?"


This is what I do as well, I don't understand why programmers try so hard to avoid ever taking notes (or reading a manual). It's somehow too boring for programmers to work in a structured way, they want to pretend that their job is like playing a video game. I don't understand what the problem is of looking something up, and why is it so "bad" to read a little syntax before you get started, especially if it gives you better solutions.

Imagine having this attitude towards tools in general, you'd throw out most of everything. Anything that's powerful has a learning curve


I 100% agree: If I have to look something up twice, it's going in my wiki.

The first time you look something up, it might not be obvious if you'll ever use that info again, so why waste the time and clog up your notes with a bunch of noise? But if you've had to look something up twice, odds are good you'll have to look it up again someday.

And like you said, writing a note in a way that's clear and understandable helps you to recall the thing a little better next time


I do the same thing with obsidian and use the git plugin to keep it in sync (via gitlab) across everywhere that I use it. I know people who do something similar with .org emacs files. It works pretty well. I don't have the patience to maintain a wiki :)


> personal wiki open at all times

Any suggestion for a remotely accessible personal wiki? (I'm assuming that access is restricted for potential "personal" stuffs).


I'm not big on wikis but almost everyone I know hosts one on something like vultr/digital ocean, backs up to home, and does what OP mentioned about updating it in the background. Personally I use markdown files+gitlab+obsidian. There are a million ways. You might like twiki for a "standard" wiki or tiddlyWiki for a more eccentric one :)


Which is why I've always stuck to `perl -pi -e`.

It may not be as powerful but it's just one thing to learn. Also it understands regexs like '\d', etc which are kinda universal but for some reason don't work with many of these utils.


I wouldn't call having the de-facto standard for an advanced regular expression system, a full programming language and the entirety of the CPAN module system easily pulled in and used less powerful.

I use sed for a lot of simple stuff, but I use Perl all the time for more advanced stuff. Perl is great for grabbing complex chunks out of a line and doing something with them that requires a bit of state. For example, writing a quick and dirty logfile parser to figure out how many requests and how many bytes were used by each IP over a time period (the time period limiting usually being a prior grep before being passed to Perl). I've probably taken 60 seconds to write out something to parse exactly what I want from a log hundreds of times.


Perl is certainly more powerful than sed or awk :)


I also read once that perl was the most ubiquitous tool across all Unixen flavors, but I don't have a link to cite for it


I'm mostly with you (except for your last sentence). I've picked up a lot of languages over the years, and can fall back into most of them pretty fast, but sed and awk especially are hard. On the occasions that they really are the best tool for the job, I get actual books out and set aside a window of a few hours to plod through it.

I think the problem is that they fall into an uncanny valley between "simple tool" and "verbose programming language"; they are really terse languages wearing a simple tool's sheepskin.


If I measure time to solve a problem and not line count, using a language like Python or Ruby might take 2x the lines, but for a line count approaching zero anyway, I'm not sure that matters. Performance is comparable, I don't need to relearn anything, and someone else who doesn't know awk/sed doesn't need to learn it.

And when one is using awk/sed that script is often invoked from a shell, bash for instance. And since the awk/sed can't hold its own as a general purpose programming language, now I am writing bash code, a personal sin.

Awk/sed made sense for the time and place they are from, but things have changed.


> Awk/sed made sense for the time and place they are from, but things have changed.

Don't do that. Do not assume something is of no use because you don't know how to use it. Double that recommendation for tools that have widespread use. There is a reason they're used everywhere.

In the case of awk/sed/perl: Nothing beats them for composition in one-liner text processing pipelines.


plus, they exist in the vast majority of unix like systems. You don't have to install anything. They are just there and available to use. Even using them in a simplistic manner is light-years ahead of trying to do stuff manually.


I have moved on to structured data, awk and sed handle line oriented text formats.

Don't assume I don't know how to use it. :)

One line versus ten, I'll take the more readable thing over the APL.


>Performance is comparable

Can't find links now, but I'm fairly certain there were stories in the past about how switching to AWK from other tools increased performance


Yes, this is a Standard HN Reply: the "Specialist's Rebuttal". It's usually in some form of, "___ is obsolete, ___ is better in every way, everyone should just be using ___ now."

Okay, acknowledged. Python and Ruby are superior in every regard. Would you mind letting us chat about sed and awk now for a minute?


"Be kind. Don't be snarky."

"Please don't sneer, including at the rest of the community."

https://news.ycombinator.com/newsguidelines.html


I replied to someone's "There is no reason for anyone to program in C nowadays" recently on here with "But I like programming in C". They hadn't considered that reason in their mental survey of Every Possible Reason.

Also.. I use Awk every day, for all kinds of things. I forget the order of parameters in split, sub etc, which takes a few seconds to look up with man awk. I got into sed a while ago but never use it. Awk, together with sort and uniq sometimes, is all I need.

And it is interactive. You have an input. Print that out and look what needs to be done. Write the first transforming step in Awk. See what that did. Add the next step. See what that did. Extremely interactive. After a few minutes I have a one-liner (often 2 or 3 wrapped lines) that does exactly what I want, and easy to adapt to other similar things. I have very large bash history enabled so they stay Ctrl-R-accessible.


[flagged]


Would you please not post in the flamewar style to HN, regardless of how provocative another comment was or you feel it was? This is not at all what we're trying for here.

https://news.ycombinator.com/newsguidelines.html


I don’t know what flame war style is, not playing dumb. I think the person behind the provocative comment isn’t much different than me.

I thought my comment might land ok, which due to a lack of response might have. It explained educated and <funny joke>

I might have typed the same thing in a different time.

*edit, it would be fun to have a forum board that allowed for out of band semantic information so … being cut off


Actually when I re-read your comment it doesn't seem so flamewarrish to me—sorry! a lot of the time I'm skimming these things in haste.

A couple bits that do correspond to 'the flamewar style', though, are when users start telling each other to "chill a bit"; and when people start arguing about "moving the goal posts and putting words in my mouth". It's a reliable sign of bad internet discussion when people start arguing about what they are or aren't saying (rather than talking about whatever the ostensible subject was).


Wait I need to toss an extra bomb over the wall and point out the superiority of swapping "sed" out for "perl -pe" and getting everything sed had plus superior regex without needing to learn much of anything new.

(I might be wrong about "everything" but that's what bomb-tossing is all about)


Welcome to the discussion. I like it, I haven't perled in a while but does that mean you can evolve from a one liner to a full program using all of perl using this technique?


I suppose it's a "pathway to perl" in the sense of a mild gateway drug. Perl is definitely one-liner-friendly. Every time I get into Perl I have to relearn it all over again (which is apparently not an isolated phenomenon), so I don't have great suggestions for digging further. I just habitually reach for perl -pe about 3/4 of the time.


I really wish all these bash scripts would be replaced with python scripts.


Yeah I redid some scripts on my team for my own usage because our team bash guru maintained them and didn't want anyone to change them. So I did my own versions in python. Sure they were mostly 2x+ as long but they were very easy to read and modify and test. The bash guru finds out and brings it up at a team meeting trying to shame me, but it ended up with my boss asking the team, and they voted on using my scripts in lieu of the older bash scripts as I had written a few unit tests around them and the rest of the team understood them without digging out unix tomes.


I have done something similar. What I really found worked well was to wrap the bash scripts in python to instrument them, and if they just operate on data, you can even spawn another command and run them both at the same time, confirming that the output matches.

Great way to have confidence in a port and provide for a fallback path. Eventually the new code will take over the old code and the old code will wither and die.


It seems like when a bash script gets over say 50 lines or so I start doubting whether something should be a bash script :) . I saw an IRC client written entirely in bash once and it was quite a resource of esoteric bash scripting haha.


I gave up on them except for the very most common uses of them for slicing/dicing text, after the most essential basics I just turn to python and suck it up. Learning to pipe things in/out in python is a great skill to have in these circumstances. I think perl is the king of this but I'm not going back to that... never again :)


How long does the relearning take exactly? My experience has been that if you've used these tools before, it's pretty easy to remember or refresh your memory. In that sense, they're not very different from other tools/syntaxes that you haven't used in a while.

And if you do use them frequently enough, you might not even need to refresh/relearn anything.


I basically gave up on relearning them considering the number of times I’ve had to do it and the amount of effort involved to relearn them. Maybe it’s just me, but I just write python now.


Makes sense if python is what you otherwise use for everyday work, it's always easiest if you can avoid context switching.

To me it's bash, sed and awk for the basic text-processing things, and then perl for anything that requires more programming. These are very similar syntaxes (and ways of thinking I'd say), so they fit together great - and also are close enough to the php and js that I do at work to make the transition painless. But if I had to write it in python or ruby or God forbid C++ - all of which I used a lot, but a long time ago - I'm sure it would be a big struggle for me now.


The full specification for the AWK language is 40 pages in the book written by the authors.

I sometimes forget the exact syntax for pipes and file open/close, but it's not hard to find in 40 pages.

AWK is in busybox, and a lot of other places that Python simply cannot go.

The control structure syntax is also the same as C/JavaScript/PHP/C++ etc., so it certainly does not hurt to know it.

https://archive.org/download/pdfy-MgN0H1joIoDVoIC7/The_AWK_P...


You need to have a mental model for each tool and some idea of syntax associated with it. Once I learned about IFS,OFS, BEGIN, END and {}, as well as the fact that variables and their references have the same syntax, unlike in Perl... I could pick up coding in awk at any moment


Perl has BEGIN and END as well, and when using it with -n or -p for line processing mode it uses -F to set the field separator, just like awk. In fact, it shares many flags and features with awk, since it was meant to be usable as a more powerful version of the same tool.


IFS is the POSIX shell; in awk, it's just FS.

I think by {} you mean an argument to the find command.
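
For reference, a tiny sketch of the awk side of this: FS and OFS set in a BEGIN block, with the work done in `pattern { action }` blocks. The sample inputs are made up.

```shell
# FS splits input fields, OFS joins output fields.
# Assigning $1 = $1 forces awk to rebuild the record using OFS.
joined=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }')

# An action with no pattern runs on every line; END runs once after the last line.
total=$(printf '%s\n' 1 2 3 | awk '{ sum += $1 } END { print sum }')

printf '%s %s\n' "$joined" "$total"
```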


Indeed. Thanks for the hint.


Haha, one of the reasons I wrote this. It's also a way of practicing and reminding myself sed.


This is everyone :)

I love awk and whenever I __do__ find a task where awk helps me, it's really an amazing experience and it is always a boon, never a burden.

But I really suck at remembering the syntax as I only hit tasks that awk solves well rarely. What I do remember is situations where awk definitely can likely help. Same with sed, tr, etc.

Keep in mind, every time we automate something or introduce a new tool/process, it's a cost benefit analysis; when I have dozens of gigs of logs to parse to figure out which hosts out of hundreds are having issues, it's a no-brainer for me; the 30 minutes to revisit a few awk and bash tricks is a far better investment than trying to grep/scroll my way through literally millions of lines of logs/code.

When I first learned awk I definitely went through a 'hammer' phase (i.e., when you have a hammer, everything looks like a nail); once that phase passed, I find that I'm much more disciplined on when I break out awk or similar tools.

Ultimately it's about identifying when you have the "right" tool for a specific job; I might need to invest time to revisit some syntax, but if the end result is that my 30 minutes saves hours, then it's a good 30 minutes. I'm at the stage where the basic text munging with awk is a no-brainer (or at least I remember the mistakes I made previously and how I fixed them); this was achievable only after a few projects where I did have to spend a bit more time wrangling awk than I preferred, but it saved me a lot of time in the future. Just knowing what I __can__ do with a given tool helps me make educated decisions on where I should invest my time on a given issue.


May I ask what is the alternative? Creating an entire application for a single use? designing a generalized case and making a monolithic application with lots of obscure options?

What about setting up the project for the application, setting up instrumentation and testing, debugging, etc.?

Even supposing that all that extra work makes sense for your unique string manipulation needs, what about the next time something like this comes up but is out of scope for your first tool? Do you end up with lots of little applications, with lots of little features and arguments? When you come across that similar-but-not-quite need down the road, do you have to scour your library of full blown apps to find something that fits? Not to mention maintaining them against future system updates...

OR - you could just force yourself to get over the hump, learn the conceptual basics at play and then when you come across a novel situation down the line you can just rattle off a tailored one-liner and have your answer in a few moments...


Since sed is the stream version of ed/vi, it used to be easier to use and remember for people who used ed/vi as their primary editor.


100%. The other day I wanted to do some transformations on a file using the likes of regex, sed and grep. After 1 hour I just gave up and thought "hey, why not try doing it in Python". It took me just 5 minutes to write a small script, which was eye opening. Since then, I just write python scripts instead of toiling with commands I don't know.


Personally I'm fine with using a complicated expression (or a pipeline) built out of basic parts where it's not necessary (because there's some whizzbang operator that'll make it so much more succinct) if that's what I can remember or what is clearer to understand with fresh eyes.


I mostly find them useful when you can't be sure anything else will be installed.

What we really need is something that can "compile" down to sed for deployment in that kind of environment.

Then that compiler can contain all the weird stuff you need to remember.


In these instances where you find a one liner useful but you use it rarely, just put it in a function with comments on how it works. When you want to use it the function is ready and the explanation too.
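
Something along these lines, as a sketch; the function name `print_lines` is made up for illustration:

```shell
# print_lines START END [FILE...]
# Print lines START through END of the given files (or of stdin if none are given).
#   -n          suppress sed's automatic printing
#   START,ENDp  print only the lines in the range
#   ENDq        stop reading once the last wanted line has been handled
print_lines() {
  start=$1
  end=$2
  shift 2
  sed -n "${start},${end}p;${end}q" "$@"
}

# Usage: lines 5 to 7 of a 20-line stream.
seq 1 20 | print_lines 5 7
```

Drop it into a file that your shell startup sources and the explanation travels with the one-liner.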


cheat.sh is your friend. E.g for sed:

curl cheat.sh/sed

Will show you several examples.


cheat.sh & cht.sh < Nice! https://github.com/chubin/cheat.sh

Also, LDP: Advanced Bash-Scripting Guide: https://tldp.org/guides.html


awk is always a struggle for me to re-learn, but sed is simple.

I basically use it for simple text substitutions; in that case it is the same as what I use for vim editing.


Nice descriptions, but some of them need changes to match the command. For example:

    sed '1~2p' file.txt #needs -n option

    s '1,$ s/foo/bar/' file.txt #s --> sed and 1,$ --> 5,$
Many commands using `-r` do not need the option for the command used (for ex: `sed -r '/start/q'`). Also, using `-E` is preferred instead of `-r` since some of the other implementations support this option but not `-r`.
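
A quick sketch of those three points, with made-up input. Note that the `first~step` address is a GNU extension, so the first command assumes GNU sed:

```shell
# 'p' prints in addition to sed's automatic printing, so address + p needs -n.
# 1~2 (GNU extension) selects lines 1, 3, 5, ...
odd=$(seq 1 6 | sed -n '1~2p')

# -E enables extended regular expressions: unescaped ( ) and + act as operators.
swapped=$(echo 'aaa bbb' | sed -E 's/(a+) (b+)/\2 \1/')

# A plain literal match needs neither -r nor -E.
until_start=$(printf '%s\n' x start y | sed '/start/q')

printf '%s\n' "$odd" "$swapped" "$until_start"
```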

---

I wrote a book on GNU sed with plenty of examples and exercises: https://github.com/learnbyexample/learn_gnused It is free to read online and there's a detailed chapter for learning BRE/ERE regex flavor as well.


Thank you for the constructive feedback, I've corrected the mistakes pointed out. I've also changed -r to -E based on your (and some other people's) advice.


Instead of wrapping sed with a bash script to get multiple expressions to do their things, I normally use 'sed -e'.

sed -e s/a/A/ -e s/foo/BAR/ -e s/hello/HELLO/

I use -e often enough that I usually use -e whether I expect to use more than one expression or not. There's a reasonable chance that I will be going back into my history and will need to add another, which is easier if the first -e is already there.

Also, 'sed -e expr1 -e expr2' gives the same results as 'sed expr1 | sed expr2'. In both cases, the order of the expressions matters: later expressions may change things altered by the first.

  $ echo foo | sed -e 's/f/g/' -e 's/goo/go/'
  go


Then there's sed -e 'cmd;cmd;cmd' as well.
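
A small sketch showing the three spellings side by side for simple substitutions:

```shell
input='foo'

# Three ways to apply the same two substitutions in order.
a=$(echo "$input" | sed -e 's/f/g/' -e 's/goo/go/')     # one -e per command
b=$(echo "$input" | sed 's/f/g/;s/goo/go/')             # commands separated by ';'
c=$(echo "$input" | sed 's/f/g/' | sed 's/goo/go/')     # one sed process per command

printf '%s %s %s\n' "$a" "$b" "$c"
```

One caveat with the `;` form: commands whose argument runs to the end of the line (`w` and `r` with a filename, or `a`, `i` and `c` with text) cannot be followed by `;`, so those are safer with separate `-e` options or a script file.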


Something I crave in the worst way is a sensible story for accumulating a “toolbox” of snippets and scripts and commands.

They don’t need to be ready to go, but ideally:

- natural language searchable

- add a small description

- CLI to search, examine, and copy

- not a <favourite text editor> plugin

- sync with a GitHub repo.


One of the best hackers I ever worked with turned me onto this method.

Use a flat text file in the editor of your choice. Search as needed. Tags work because you can search them. You delimit each section as a note. I do it like this

---

Next note

---

It seems limited and primitive, you don't get any markup or fun stuff. But it's soooo good. There is nothing to break or update. Search is the method I use most anyway in my other note taking apps.

And you wind up putting everything in there, including commands that you're staging to run. Those staged commands become snippets later.


I still use flat text files for 90% of my daily note taking. Have done since the early 90s, and the same notes I took then are still working in the same folder I use today. I sync it to my various devices (every device and platform has a text editor), and it’s never failed me.

I have txt files for commands I want to remember for each OS I daily use (linuxcmds.txt, windowscmds.txt, maccmds.txt) and split them the same way you said but 5 dashes. Also keep txt files for install/setup notes for each os, links to every program I want to install in fresh installs, etc etc.

Text files work on every single OS/device, the format is stable, it’s easily searchable, easily shareable, and my entire 30 years of notes is measured in megabytes.


Text files are good, but even more primitive and also awesome: I find a sheet of paper taped to the wall in my office is really underrated. Being able to just look up and see it is awesome, as well as the natural limitation to only as much stuff as will fit on a sheet of paper.

Every now and then I start a new sheet, dropping off the stuff I've memorized.


I keep a commands.org file for exactly this purpose. Whenever a command is useful to maintain outside of my shell history, it gets documented and placed there, along with variations, things to look out for, etc.

Eventually I'd like to publish it as a microblog for such snippets ala commandlinefu.com, but priorities and laziness get in the way. :/


Many people maintain those in a text file, somewhere. My approach is slightly different. I keep them in my shell history file itself.

First, set your HISTSIZE to 1 trillion. Then, tag the relevant commands in the history file with a comment after the command (e.g. sed -n '5,10p' some.log # print selected lines). It gets searchable within the shell, either by the comment or the start of the command. Always at my fingertips.
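
A rough sketch of that setup, assuming bash. The numbers are arbitrary large values rather than a literal trillion, and a throwaway file stands in for the real history file so the search can be shown safely:

```shell
# Lines for ~/.bashrc (bash-specific variables).
HISTSIZE=10000000        # commands kept in memory per session
HISTFILESIZE=10000000    # lines kept in the history file

# Searching by tag: a temporary file plays the role of ~/.bash_history here.
histfile=$(mktemp)
printf '%s\n' "ls -l" "sed -n '5,10p' some.log # print selected lines" > "$histfile"
found=$(grep -F '# print selected lines' "$histfile")
rm -f "$histfile"
printf '%s\n' "$found"
```

Interactively, Ctrl-R followed by a few words of the comment does the same lookup without leaving the prompt.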


I use .txt files for most of my tasks: all sort of lists, notes, snippets, even organized bookmarks (with comments). I use them also with some indentation and simple geometries for structured information (table-like or hierarchies).

Some of them are shared via DropBox and despite the risk of simultaneous edits we had basically no incidents: the simplicity and efficacy justifies it for us.


I stumbled on this lately, which I've borrowed myself and I'm finding it pretty convenient: https://ianthehenry.com/posts/sd-my-script-directory/

Shell completions are essential for it, so I had to stumble through writing my own for Fish, but now that it's setup it's quite nice.


Writing shell completions is such a pain (and fish is still way ahead of zsh or bash).

Btw if you like fish autocomplete, you might be interested in fig.io.

We've spent a bunch of time making it super easy to add your own completions for scripts or custom CLI tools. :)


I had the same craving.

My linux only snippet manager: https://github.com/barbuk/snippy


I think you could just use a text file and then let chezmoi manage the distribution via git.

Or a personal wiki, I know people that run their own wikis for just this sort of thing


I find tldr is good for rarely used commands:

https://tldr.ostera.io/sed

(That's a web version but I usually access it on the cli or in my editor)


tldr is great! I personally use this faster implementation in Rust: https://github.com/dbrgn/tealdeer

And I just became aware from reading the README that there's an even faster one written in Zig.


I understand the usefulness of this kind of command listing for people who just use sed occasionally and want to quickly copy/paste something. Still, I think just learning how sed works is actually easier than treating it like magic. It's not awk. Sed is both logical and simple.

Sed is the stream version of ed. It's a full line-oriented editor. You give it commands and it executes them on lines, one line at a time. Commands always have the same form:

  [lines range] [action][parameters]
You just have to learn how ranges are defined and what the actions are; then you can check the documentation for the parameters when you need to, and you will be good to go forever.

As a bonus, once you realize there are more actions than just s for substitute, you start realizing that people sometimes do very convoluted things with sed because they don't know how to use the other actions. For example, there are actions to delete lines (d), to pull the next line into the pattern space so lines can be joined (N), and to copy text to and from the hold space (h, H, g and G).
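
A short sketch of the range-plus-action form with a few different actions; the five-line sample is made up:

```shell
sample=$(printf '%s\n' alpha start beta end gamma)

# [line range][action]: print only lines 2 through 4
range=$(printf '%s\n' "$sample" | sed -n '2,4p')

# [regex range][action]: delete from the line matching "start" through the line matching "end"
trimmed=$(printf '%s\n' "$sample" | sed '/start/,/end/d')

# [single line][action]: quit after line 1 (sed prints it before exiting)
first=$(printf '%s\n' "$sample" | sed '1q')

printf '%s\n' "$range" "$trimmed" "$first"
```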


> I think just learning how sed works is actually easier than treating it like magic. It's not awk.

As someone with over a decade of heavy sed use, who only got into awk recently, i'd say awk is far easier and less magical. Awk is just another programming language, for the most part. Using sed for anything substantial requires you to think in strange directions, to find ways to thread state and control flow through the tiny holes the language gives you.


I think I both agree and disagree.

As you rightfully pointed out, I wouldn’t do anything overly complicated with sed because it really is a line editor and anything which is not line based and can’t be explained simply in the context of using an editor will quickly seem clunky and overly complicated.

I wouldn’t do anything period with awk. I agree with you that it’s just another programming language but one which feels extremely dated with both a terrible syntax and awkward semantics. The rare times I have encountered something done with awk it was either so simple it should have been done with sed or so complicated I wish a proper scripting language like python or perl would have been used instead. As far as I’m concerned, awk is a tool without a proper use-case. It’s always better to use something else.


Interesting! I don't find awk too strange; indeed i strongly prefer it to Perl, which i wrote a lot of around the turn of the century.

These days i reach for awk for a lot of log processing, particularly when it needs to be stateful. For example, i have a log file with entries for remote clients connecting and disconnecting. An awk script can loop over that and keep an array of clients, adding and removing them as appropriate, and be easily tweaked to report different things - emit an event for each connection-disconnection pair, or track the number of clients connected over time, etc. I couldn't do that at all with sed. It would be more awkward with bash. It would be easy enough, if a little bit more verbose, in Python.
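A rough sketch of that kind of stateful awk script, assuming a made-up log format of "timestamp event client" (the real log layout is not given here):

```shell
# Hypothetical log: numeric timestamp, event name, client name.
log=$(printf '%s\n' \
  '100 connect alice' \
  '105 connect bob' \
  '130 disconnect alice' \
  '160 disconnect bob')

sessions=$(printf '%s\n' "$log" | awk '
  # Remember when each client connected.
  $2 == "connect" { start[$3] = $1 }
  # On disconnect, emit one event per connect/disconnect pair
  # with the session length, then forget the client.
  $2 == "disconnect" && ($3 in start) {
      print $3, $1 - start[$3]
      delete start[$3]
  }
')

printf '%s\n' "$sessions"
```

Reporting something else, such as the number of connected clients over time, would only need a counter updated in the same two rules.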


Sed is probably the one tool I owe my career most to. I made a living mostly on it for a few years as an ops guy, and I didn't even know half of these use cases. The variable delimiter alone was such a beautifully thought out decision that made everything easier, that I likely would have never considered had I written it.


I'm glad to hear that. I'm learning (and re-learning) sed myself, especially after long periods of not using it. So in a way I've written it for me too.

Btw, there is one more use of a custom delimiter which is more arcane, this time on the address (and you can combine it with a custom delimiter on the s command): `sed -s '\_/bin/bash_s:grep:egrep:' myfile.txt`
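A quick way to see what that does, using made-up input lines. The leading backslash tells sed that `_` delimits the address regex, so the slashes in `/bin/bash` need no escaping, and the `s` command then uses `:` as its own delimiter.

```shell
# Made-up input: only lines containing /bin/bash should be touched.
input=$(printf '%s\n' \
  '#!/bin/bash' \
  'grep foo' \
  'exec /bin/bash -c "grep bar"')

# \_..._ is the address (custom delimiter _), s:...:...: is the action.
out=$(printf '%s\n' "$input" | sed '\_/bin/bash_s:grep:egrep:')

printf '%s\n' "$out"
```

Only the third line changes: the second line contains grep but not /bin/bash, and the first contains /bin/bash but no grep.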



I use sed -i.bak '10d' ~/.ssh/known_hosts so many times a week. awk, sed, and regex really are the three tools I am happy to have learned well at my first job. They’ve helped me immensely throughout my 10 years as someone who bridges developer and ops roles.


For that one specifically, you may want to check your copy of "ssh-keygen" as it has recently(?) learned to manage those known_hosts entries without resorting to text file manipulation (`ssh-keygen -R something.example.com` for example)


Mastering one tool is probably wise, but it's also good to know which tools may be better suited for some purposes.

For "keep the first word of every line", I prefer awk:

    awk '{ print $1 }' < file.txt


I agree with the people here saying that it's better to just use your favourite scripting language unless your job is mostly about writing little bash scripts like that (maybe sysops people?)...

Here's a groovy script to do the same:

    new File('file.txt').eachLine { println it.split()[0] }

Not as neat as awk, but not much worse either... and as I use Groovy a lot for testing, and its syntax is simplified Java (which I use a lot as well) it's just much better for me... if I had to ssh into other servers where Groovy or Lisp (my other favourite "scripting" language) are not available, then yeah, I would learn awk better (I actually like awk, and use it sometimes for the rare occasion I don't have groovy/lisp installed).


In cases where it's a one time use, anything that gets you an answer easily is great. Could be Groovy, Ruby, Python, Perl, etc. or of course these bin tools.

But if it's going to be used a lot, and start-up time matters, then some of the full programming languages can become expensive for the temporary load they create just spinning up. In those cases, the standard Unix tools tend to be so much faster and lighter that it's worth learning them.


Perl is much better than Python for that, for obvious reasons: Perl is like an improved sed on steroids with some C inspiration.


Use the best tool for the job.

sed is just a DSL (domain specific language) specifically designed for manipulating text files line by line.

Note that for matching strings even in Groovy/AWK/Python/Perl/... you'll probably use regexp which are another DSL. Would you recommend to use string functions instead?


> Would you recommend to use string functions instead?

Actually, yes, of course. Regex should be the last resort, really. It's unreadable, presents security issues depending on where it's used, and much of the time using string functions in a "proper" language will be actually easier unless you're a regex expert (which is a pretty silly thing to acquire expertise on).


I adore Groovy the language but am compelled to note its horribly slow bootup time. On a commodity laptop it's usually a couple of seconds - acceptable for some things, but hard to justify for one-liners.


>For "keep the first word of every line", I prefer awk:

Me too, though the awk version throws out any leading whitespace, so it's not quite the same.
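The difference is easy to see on an indented line. This is only a sketch: the sed expression here is my own stand-in that treats a "word" as a run of non-whitespace, not the snippet from the post.

```shell
line='   hello world'

# awk splits on whitespace, so leading blanks never reach $1.
awk_out=$(printf '%s\n' "$line" | awk '{ print $1 }')

# This sed keeps the leading whitespace plus the first run of
# non-whitespace characters and drops the rest of the line.
sed_out=$(printf '%s\n' "$line" | sed -E 's/^([[:space:]]*[^[:space:]]+).*/\1/')

printf '[%s]\n[%s]\n' "$awk_out" "$sed_out"
```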


I'm annoyed at how sed has `--regexp-extended`, while grep has `--extended-regexp`. I'm sure there's a reason for it, but it always makes me stop and think which command has which flag.


This is awkward… I know it’s hip and cool to ask for donations, but for a tiny collection of sed scripts that don’t include detailed explanations?

I feel like sed is what deserves the donation…

http://sed.sourceforge.net/sed1line.txt ?

https://www.fsf.org/about/ways-to-donate/


Hey there! Thank you for the input! I'm sorry you feel that asking for donations is unwarranted. I agree wholeheartedly that FSF deserves all the love it can get. I'm not asking to get rich or take the spotlight - I'm asking so I can keep improving that guide & create more.

I didn't add explanations (although I wanted to) since this was intended as a quick tips page for sed. I think there are much better guides than I could ever make (including the info page). In a way it works better since it's more digestible.

Thank you for the link you provided - it looks awesome! I'll look into it and add to the existing guide if needed (while giving credit, of course).


When I see anything that leaves a backup file as .bak with inline replacing, it makes me reminisce over Perl. Perl one liners are still hard to beat.


sed actually provides a scripting language, and s/foo/bar is just one statement in that language. This surprised me when I first learned about it.


What is the easiest way to make sure a "field, or option" is present in a file, similar to what ansible's lineinfile does?

For example suppose I need to add this setting to /etc/ssh/sshd_config:

PermitRootLogin yes

One idea that comes to mind is to first delete the line, using the pattern shown in this post, and then add it back:

> Regex sed -E '/^#/d' file.txt - delete lines where regex matches

sed -i -E '/^PermitRootLogin/d' /etc/ssh/sshd_config && echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config

Is there a more straightforward way with sed?
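One possible shape for this, as a sketch only: `ensure_line` is a hypothetical helper name, and it assumes the option name contains no regex metacharacters and that the file ends with a newline. It writes through a temporary file instead of `-i`, since `-i` behaves differently between GNU and BSD sed.

```shell
# ensure_line is a hypothetical helper, not part of sed or any standard tool.
# $1 = file, $2 = option name, $3 = value.
ensure_line() {
    tmp=$(mktemp) || return 1
    # Drop every existing line that sets the option, then append the wanted one.
    { sed -E "/^${2}[[:space:]]/d" "$1"; printf '%s %s\n' "$2" "$3"; } > "$tmp" &&
        cat "$tmp" > "$1"    # cat keeps the original file's owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a scratch file rather than the real /etc/ssh/sshd_config.
cfg=$(mktemp)
printf 'Port 22\nPermitRootLogin no\n' > "$cfg"
ensure_line "$cfg" PermitRootLogin yes
ensure_line "$cfg" PermitRootLogin yes   # running it again changes nothing
result=$(cat "$cfg")
rm -f "$cfg"

printf '%s\n' "$result"
```

Unlike lineinfile, this always moves the setting to the end of the file and leaves commented-out copies alone.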



> Print one line

> > sed -n '10p' myfile.txt

Which line? Why p? I can make assumptions that this means “line 10, print” but easy-to-explain could go a small step further and complete the explanation.
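For what it's worth, a short sketch of what the two pieces do, on made-up three-line input: the number is a line address, `p` is the print action, and `-n` turns off sed's default printing of every line.

```shell
input=$(printf 'a\nb\nc\n')

# -n: print nothing by default; 2p: on line 2, print the line.
only=$(printf '%s\n' "$input" | sed -n '2p')

# Without -n every line is printed anyway, so line 2 shows up twice.
dup=$(printf '%s\n' "$input" | sed '2p')

printf '%s\n' "$only" '--' "$dup"
```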


You can avoid typing errors in commands or code as part of Markdown files (or any other markup language) by using some form of 'literate programming' like Jupyter notebooks. The default IPython kernel supports shell by putting a '!' before each line of shell commands. You can get Jupyter kernels for almost any programming language - though many work on Linux only.

Or you can use Org-Mode (with babel), if you're an Emacs user.

That's how I write all documentation involving some sort of code.


"keep the first word of every line (where word is defined by alnum chars + underscores for simplicity sake)"

  sed -E s_[a-zA-Z0-9_]+.*_\1_' file.txt

You're missing an opening single quote and a set of parentheses for this to work. It feels like you should have a unit test setup for these snippets, given that there are other typos found in other comments here.
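A version with both fixes applied might look like the sketch below. It is my reconstruction, not the author's final snippet: I also anchored the match with `^` and switched the delimiter to `/`, to avoid any ambiguity with the `_` inside the bracket expression.

```shell
# Keep the first word (alnum chars plus underscore) of each line.
first_word=$(printf '%s\n' 'foo_bar baz qux' |
  sed -E 's/^([a-zA-Z0-9_]+).*/\1/')

printf '%s\n' "$first_word"
```

A line that does not start with a word character is left unchanged, because the pattern does not match it.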


Waiting for an AI powered shell, or something similar, so that one can avoid wasting time on these trivial tasks that require too much attention if solved with traditional methods:

"I made an OpenAI-powered Linux shell that guesses your bash command": https://www.youtube.com/watch?v=j0UnS3jHhAA


Remove csv header.

    sed 1d

Select line 123

    sed -n 123p

    # or
    sed -n 123{p;q}


sed -n '123{p;q;}'



wrote this ages ago, if you want something to put all your office docs and pdfs in a solr database on FreeBSD (indexable by page number), knock yourself out...

https://gist.github.com/awhileback/1fffac899d9321a6a9ec15bc9...

Requires some ocr / pdf parsing / metadata tools, see the comments.


“s '1,$ s/foo/bar/' file.txt”. should be replacement from 1 to end, not 5 to end (#4 on your list)


I just submitted a PR to fix that. :)


…and I closed it as a dupe already. I was seconds slower than the other contributor. :)


Is there a way we can ask these kinds of questions in plain English and have it return a snippet like this?


StackOverflow


StackOverflow: "There are no stupid questions, but there are a lot of inquisitive idiots. Closing as duplicate of <unrelated question>."


There were efforts to do that for general programming; traces of those efforts survive today as COBOL and SQL.


Sounds exactly like GitHub Copilot.


Have you submitted them to commandlinefu.com?



