New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: disable file:// urls when hardening enabled #24858
base: main-2.x
Are you sure you want to change the base?
Conversation
Stacks and templates allow specifying file:// URLs. Add option to disable their use for people who don't require them.
Fix tests on Darwin / MacOS. Also fixes spelling issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first pass; I can do a deeper review another day. Mostly about file names in errors...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One question, not blocking.
// limit how much we read into RAM | ||
var size int | ||
size64 := st.Size() | ||
if size64 > limitReadFileMaxSize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to add a && limitReadFileMaxSize > 0
in case it becomes a variable some day? Not blocking on merge, just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Most of my in-line comments are observations, but there are a few places where it would be good to update comments (eg, remocal and zero size files in /proc).
I also did some manual testing (using the amd64 deb from https://app.circleci.com/pipelines/github/influxdata/influxdb/38891/workflows/ec8d697d-ee31-46a3-9f15-2f1238f2eb8b/jobs/363718/artifacts) for /api/v2/stacks:
$ cat ~/stuff/ok.json
[
{
"apiVersion": "influxdata.com/v2alpha1",
"kind": "Label",
"metadata": {
"name": "label-1"
},
"spec": {
"color": "#eee888",
"description": "desc_1"
}
},
{
"apiVersion": "influxdata.com/v2alpha1",
"kind": "Bucket",
"metadata": {
"name": "rucket-1"
},
"spec": {
"associations": [
{
"kind": "Label",
"name": "label-1"
}
],
"description": "desc_1",
"retentionRules": [
{
"everySeconds": 10000,
"type": "expire"
}
]
}
}
]
# in one terminal, start a http server to serve this over http://
$ cd ~/stuff
$ python3 -m http.server
# via http://
$ curl --insecure -H "Authorization: Token $TOKEN" --request POST "$URL/api/v2/stacks" --data "{\"orgID\":\"$ORGID\", \"name\":\"some name\", \"description\": \"some desc\", \"urls\": [\"http://localhost:8000/ok.json\"]}"
{
"id": "0ce72523377b4000",
"orgID": "697959a536e728a2",
"createdAt": "2024-04-17T10:05:13.181224325-05:00",
...
$ influx apply -o $ORG --stack-id 0ce72523377b4000
LABELS +add | -remove | unchanged
+-----+---------------+------------------+---------------+---------+-------------+
| +/- | METADATA NAME | ID | RESOURCE NAME | COLOR | DESCRIPTION |
+-----+---------------+------------------+---------------+---------+-------------+
| + | label-1 | 0000000000000000 | label-1 | #eee888 | desc_1 |
+-----+---------------+------------------+---------------+---------+-------------+
| TOTAL | 1 |
+-----+---------------+------------------+---------------+---------+-------------+
...
$ influx stacks rm --stack-id 0ce72523377b4000
# now try with file://
$ curl --insecure -H "Authorization: Token $TOKEN" --request POST "$URL/api/v2/stacks" --data "{\"orgID\":\"$ORGID\", \"name\":\"some name\", \"description\": \"some desc\", \"urls\": [\"file:///home/ubuntu/stuff/ok.json\"]}"
{
"id": "0ce725ef413b4000",
"orgID": "697959a536e728a2",
"createdAt": "2024-04-17T10:08:42.116599762-05:00",
...
$ influx apply -o $ORG --stack-id 0ce725ef413b4000
LABELS +add | -remove | unchanged
+-----+---------------+------------------+---------------+---------+-------------+
| +/- | METADATA NAME | ID | RESOURCE NAME | COLOR | DESCRIPTION |
+-----+---------------+------------------+---------------+---------+-------------+
| + | label-1 | 0000000000000000 | label-1 | #eee888 | desc_1 |
+-----+---------------+------------------+---------------+---------+-------------+
| TOTAL | 1 |
+-----+---------------+------------------+---------------+---------+-------------+
...
$ influx stacks rm --stack-id 0ce725ef413b4000
Now try stacks with 'hardening-enabled = true':
# the curl is expected to pass
$ curl --insecure -H "Authorization: Token $TOKEN" --request POST "$URL/api/v2/stacks" --data "{\"orgID\":\"$ORGID\", \"name\":\"some name\", \"description\": \"some desc\", \"urls\": [\"file:///home/ubuntu/stuff/ok.json\"]}"
{
"id": "0ce72690ee408000",
"orgID": "697959a536e728a2",
"createdAt": "2024-04-17T10:11:27.673418946-05:00",
...
# the apply is not
$ influx apply -o $ORG --stack-id 0ce72690ee408000
Error: failed to check template impact: 400 Bad Request: invalid URL scheme: invalid URL scheme
$ influx stacks rm --stack-id 0ce72690ee408000
Now try stacks with just 'template-file-urls-disabled = true':
# the curl is expected to pass
$ curl --insecure -H "Authorization: Token $TOKEN" --request POST "$URL/api/v2/stacks" --data "{\"orgID\":\"$ORGID\", \"name\":\"some name\", \"description\": \"some desc\", \"urls\": [\"file:///home/ubuntu/stuff/ok.json\"]}"
{
"id": "0ce7271e9d86b000",
"orgID": "697959a536e728a2",
"createdAt": "2024-04-17T10:11:27.673418946-05:00",
...
# the apply is not
$ influx apply -o $ORG --stack-id 0ce7271e9d86b000
Error: failed to check template impact: 400 Bad Request: invalid URL scheme: invalid URL scheme
$ influx stacks rm --stack-id 0ce7271e9d86b000
Now try /api/v2/templates/apply directly with neither hardening-enabled nor template-file-urls-disabled set:
# test http:// still works with /api/v2/templates/apply
$ curl --insecure -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" --request POST "$URL/api/v2/templates/apply" --data "{\"orgID\": \"$ORGID\", \"dryRun\": true, \"remotes\": [{\"url\": \"http://localhost:8000/ok.json\"}]}"
{
"sources": [
"http://localhost:8000/ok.json"
],
"stackID": "",
...
# influx apply --encoding json -f works too # the -f path is on the client, not server
$ influx apply --encoding json -f ~/stuff/ok.json
LABELS +add | -remove | unchanged
+-----+---------------+------------------+---------------+---------+-------------+
| +/- | METADATA NAME | ID | RESOURCE NAME | COLOR | DESCRIPTION |
+-----+---------------+------------------+---------------+---------+-------------+
| + | label-1 | 0000000000000000 | label-1 | #eee888 | desc_1 |
+-----+---------------+------------------+---------------+---------+-------------+
| TOTAL | 1 |
+-----+---------------+------------------+---------------+---------+-------------+
...
# now try with file:///home/ubuntu/stuff/ok.json
# Curiously, I was thinking should've been allowed since neither hardening-enabled nor
# template-file-urls-disabled was specified in the config file
$ curl --insecure -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" --request POST "$URL/api/v2/templates/apply" --data "{\"orgID\": \"$ORGID\", \"dryRun\": true, \"remotes\": [{\"url\": \"file:///home/ubuntu/stuff/ok.json\"}]}"
{
"code": "unprocessable entity",
"message": "template from url[file:///home/ubuntu/stuff/ok.json] had an issue: Get \"file:///home/ubuntu/stuff/ok.json\": unsupported protocol scheme \"file\""
I then tried with the official 2.7.6-1 amd64 deb, and it also fails to process file:///home/ubuntu/stuff/ok.json
with /api/v2/templates/apply. I then tried 2.1.1 (which predates #23030, which I though might've changed the behavior) and it fails in the same way.
So, /api/v2/stacks is working as designed by the PR. /api/v2/templates/apply unconditionally denies use of file:/// URLs and has for years. The difference in behavior is odd and makes me want to say that we should unconditionally disallow file:/// URLs in both for consistency, but we know that file:/// URLs are used with stacks, so I don't think we can do that.
Long story short, this PR is functionally correct.
pkg/fs/special_linux.go
Outdated
return true, nil | ||
} | ||
// It's a special file unless the device ID matches /tmp's ID. | ||
return devId != tmpDevId, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there will be problems with this implementation if the user specifies a file in /dev/shm
or if https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#TemporaryFileSystem= is used and the user specifies something in there. The latter is rather esoteric and unlikely but the former seems plausible. I'm not sure if we should try to account for /dev/shm (note that some systems mount /dev/shm with a symlink from /run/shm to /dev/shm while other (older) systems do the opposite) or just mention it in a release note. A release note is probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, /dev/shm
is only for POSIX shard memory, and it's not specifically in FHS. /run
is definitely interesting, though, because it is almost always tmpfs and is guaranteed by FHS. It's also what systemd says you should use instead of /dev/shm
. I'm thinking we should have an additional special case check for /run
. We could also add a check against /dev/shm
if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know some people have gotten /dev/shm (or /run/shm) under their fingers for having a reliably memory-backed sticky dir (/tmp-style) on Linux (eg, to generate private keys to keep off disk). That said, it feels unlikely that someone would put a template in /dev/shm (or /run/shm on a server) to then start running stacks command for influxdb. In thinking about this more, at most this is a release note IMHO.
As for /run
, it's not impossible to imagine that people might put things under /run/user
or /run/something
. influxdb doesn't have a /run/influxdb directory with 2.x and /run/user is going to be challenging to use for users since it is only accessible to the user (and not the influxdb user) so I don't think this is required.
Improve special file system exception checking for tmpfs on Linux to cover most common tmpfs mounts. Improve help for "--hardening-enabled" command line option. Improve some comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably fine as is, but I have a small pedantic suggestion for /dev/shm
and a few comment update suggestions. I'm not sure if the tests are flaky (I didn't check), but fyi, not all the tests are passing.
// the device ID won't match so we won't grant a tmpfs exception based on it. | ||
// On Linux, every tmpfs mount has a different device ID, so we need to check | ||
// against all common ones that might be in use. | ||
tmpfsMounts := []string{"/tmp", "/run", "/dev/shm"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/dev/shm
is a symlink to /run/shm
on old Linux's (eg, Debian Stretch, but there are others). Ubuntu 16.04 and later is known to have /dev/shm
as the mountpoint (not a symlink). I did find reports of WSL in 2022 that had /dev/shm
as a symlink. I think it would be good to handle that case too. AIUI, the old systems simply symlinked /dev/shm
to point to the shm
directory under the /run
tmpfs (ie, /run/shm
wasn't its own tmpfs). I think simply conditionally adding "/dev/shm" to tmpfsMounts
if it is not a symlink is all that is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The good news is that the check for the special filesystem and tmpfs exceptions is against the target of the symlink, so the same code should work for that situation. I've added test cases with symlinks as well.
fDevId, err := getDevId(fSt) | ||
if err != nil { | ||
// See above for why we're returning an error here. | ||
return math.MaxUint64, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "See above for why we're returning an error here" but we're return nil
for error
which is a little confusing for the reader. Can you rephrase the comment for clarity? (Not to mention, it seems some of the supporting comments for this are also 'below', not just 'above').
} | ||
for _, fn := range tmpfsMounts { | ||
// We ignore errors if getFileDevId fails, which could | ||
// mean that the file (e.g. /run) doesn't exist. The error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment ends with a sentence fragment.
// We got an error getting stats on /tmp, but that just means we can't | ||
// say the file is in tmpfs. We'll still go ahead and call it a special file. | ||
return true, nil | ||
// for each common tmpfs mount point, If the mount point's device ID matches this st's, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/point,/point./
?
Stacks and templates allow specifying file:// URLs. Add option to disable their use for people who don't require them.