Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


bash - Can't match regex with sed

I'm trying to match the pattern (\^|~?)(\d|x|\*)+\.(\d|x|\*)+\.(\d|x|\*)+ with sed, without luck. The file I'm running it on is this:

{
  "name": "something",
  "version": "0.0.1",
  "description": "some desc",
  "main": "gulpfile.js",
  "directories": {
    "test": "tests"
  },
  "dependencies": {
    "babel-polyfill": "^6.7.4",
    "babel-preset-es2015": "^6.6.0",
    "babel-preset-react": "^6.5.0",
    "gulp-clean": "^0.3.2",
    "jquery": "^2.1.4",
    "lodash": "^4.0.0",
    "moment": "^2.13.0",
    "moment-timezone": "^0.5.0",
    "radium": "^0.16.2",
    "react": "^15.1.0",
    "react-bootstrap-sweetalert": "^1.1.10",
    "react-dom": "^15.1.0",
    "react-timeago": "^2.2.1",
    "sprintf": "^0.1.5",
    "smoothscroll": "~0.2.2"
  },
  "devDependencies": {
    "babel": "^6.3.26",
    "babelify": "^7.2.0",
    "browserify": "~12.0.1",
    "console-stamp": "^0.2.0",
    "estraverse-fb": "^1.3.1",
    "gulp": "^3.9.0",
    "gulp-concat": "^2.6.0",
    "gulp-sass": "^2.1.1",
    "gulp-sourcemaps": "^1.6.0",
    "gulp-util": "^3.0.7",
    "lodash": "4.5.1",
    "lodash.assign": "^3.2.0",
    "lodash.isfunction": "^3.0.8",
    "lodash.reduce": "^4.3.0",
    "node-sass": "3.4.2",
    "react-bootstrap": "^0.29.4",
    "react-intl": "2.1.0",
    "reactify": "1.1.1",
    "sweetalert": "^1.1.3",
    "vinyl": "^1.1.0",
    "vinyl-buffer": "^1.0.0",
    "vinyl-source-stream": "^1.1.0",
    "watchify": "^3.4.0",
    "jsx-to-string": "~0.2.11"
  },
  "optionalDependencies": {
    "pkg-save": "~1.0.2"
  },
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "repository": {
    "type": "git",
    "url": "someurl"
  },
  "author": "authorname",
  "license": "MIT"
}

As you can see in regexr it matches the desired pattern (also matching "version" but that's another issue I'll tackle later): http://regexr.com/3e324

I'm invoking sed with the following command:
cat package.json | sed 's/(\^|~?)(\d|x|\*)+\.(\d|x|\*)+\.(\d|x|\*)+/Hello/g' -r

For the sake of brevity, the output looks something like this (i.e. the input, unchanged):

...
"dependencies": {
    "babel-polyfill": "^6.7.4",
    "babel-preset-es2015": "^6.6.0",
    "babel-preset-react": "^6.5.0",
    "gulp-clean": "^0.3.2",
...

It should replace every matched version string with "Hello".
What am I doing wrong?
Is it something to do with bad flags (I've tried /gm),
or am I not using the correct regex engine (I'm passing the -r option to use extended regex)?


1 Reply


While POSIX regular expressions support some named character classes, like [[:digit:]] and [[:alnum:]], they do not support shorthand classes such as \d and \w.

Some GNU extensions add support for shorthand classes, but only for a few of them: \w, \W, \s and \S, according to regular-expressions.info.
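
The difference is easy to check on a single line. This is a minimal sketch, assuming a sed that accepts -E (recent GNU sed does; -r is the older GNU spelling). The sample line is taken from the question's package.json, and the variable names are only for illustration:

```shell
line='"react": "^15.1.0",'

# sed has no \d digit shorthand, so the digits are not matched
# and the line comes back unchanged.
with_d=$(printf '%s\n' "$line" | sed -E 's/\d+/N/g')

# A bracket expression or a POSIX named class does match digits.
with_range=$(printf '%s\n' "$line" | sed -E 's/[0-9]+/N/g')
with_class=$(printf '%s\n' "$line" | sed -E 's/[[:digit:]]+/N/g')

printf '%s\n%s\n%s\n' "$with_d" "$with_range" "$with_class"
```

The first line printed is the untouched input; the other two have each run of digits replaced by N.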

After replacing \d in your regular expression with [0-9], I was able to transform your document. The regular expression becomes (\^|~?)([0-9]|x|\*)+\.([0-9]|x|\*)+\.([0-9]|x|\*)+, or better, [~^]?([0-9x*]+\.){2}[0-9x*]+ (thanks, Ed Morton!).

As a side note, your command could be rewritten as follows, which does not use cat:

sed -E 's/[~^]?([0-9x*]+\.){2}[0-9x*]+/Hello/' package.json
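
To see what a pattern of this shape does to prefixed, unprefixed and multi-digit versions, here is a small sketch run on three lines from the question's file. It assumes a sed that accepts -E, and it uses a variant of the pattern with the dots escaped, the ^ or ~ prefix optional, and a trailing + on the last group; the variable names are hypothetical:

```shell
# A few lines from the question's package.json as sample input.
sample='    "react-bootstrap-sweetalert": "^1.1.10",
    "smoothscroll": "~0.2.2",
    "lodash": "4.5.1",'

# Optional ^ or ~ prefix, then three dot-separated runs of digits, x or *.
replaced=$(printf '%s\n' "$sample" |
  sed -E 's/[~^]?([0-9x*]+\.){2}[0-9x*]+/Hello/')

printf '%s\n' "$replaced"
```

Each of the three version strings is replaced by Hello as a whole. Without the trailing +, "^1.1.10" would leave a stray 0 behind, and without the ? the unprefixed "4.5.1" would not match at all.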

And as noted by Matt, you'd be better off using a JSON parser such as jq.
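
A minimal sketch of the jq route, assuming jq 1.5 or later is installed (map_values is a jq builtin). The trimmed-down file below only stands in for the real package.json, and "Hello" is the question's placeholder replacement:

```shell
# Small stand-in for package.json, written to a temp file for illustration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
  "version": "0.0.1",
  "dependencies": { "jquery": "^2.1.4", "smoothscroll": "~0.2.2" },
  "devDependencies": { "lodash": "4.5.1" }
}
EOF

# Replace every value inside the two dependency maps only.
result=$(jq -c '(.dependencies, .devDependencies) |= map_values("Hello")' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Because the filter addresses the dependency objects by key, the top-level "version" field is left alone, which sidesteps the extra match mentioned in the question.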

