Posts made by scottalanmiller

scottalanmiller

@JasGot said in Is the domain .Local a real problem in a private lan that has no public facing services?:

@scottalanmiller said in Is the domain .Local a real problem in a private lan that has no public facing services?:

Nothing in the Windows world uses it. Mac uses it, and some isolated Linux stuff. Really minor in most cases.

https://en.wikipedia.org/wiki/.local

That what I had read also. Thanks. They are keeping .local during the transition!

If it causes any problems (unlikely), there are workarounds too. It's never a show stopper.

scottalanmiller

@WLS-ITGuy Typically we use the time to rebuild. Any VM should, in theory, be quick and easy to rebuild. If it isn't, it's the perfect time to make it so.

If you absolutely have to convert rather than move through application migration, then there are disk conversation tools that change the format. There's very little needed for the migration.

scottalanmiller

@EddieJennings said in ZFS Pool Online but Cannot Import:

You may want to seek out Jim Salter's content concerning ZFS. This is the community he's started since leaving the ZFS subreddit.

https://discourse.practicalzfs.com/

Like everywhere else, not one single thing similar to this issue

scottalanmiller

Nothing in the Windows world uses it. Mac uses it, and some isolated Linux stuff. Really minor in most cases.

https://en.wikipedia.org/wiki/.local

scottalanmiller

@JasGot no, it's fine. Almost nothing ever uses that and it was best practice for a long time.

scottalanmiller

Just a quick update. We are imaging the drives, converting the images to qcow2, mounting to an Ubuntu desktop and UFS Explorer is, so far, able to see the data in them. Not ideal, but it's working so far.

https://www.ufsexplorer.com/articles/how-to/recover-data-zfs-volume/

scottalanmiller

hating ZFS more and more each day, lol

scottalanmiller

@IThomeboy80 said in Not much luck with Linux Distro's:

You may want to go with other Linux distros (i.e. Ubuntu or Alma Linux)

Ubuntu is what he started with.

Alma isn't for desktop use (or IMHO production use of any kind. It's an LTS only kludge for bad software shops.)

scottalanmiller

@travisdh1 said in ZFS Pool Online but Cannot Import:

@scottalanmiller said in ZFS Pool Online but Cannot Import:

One big thing we've learned about ZFS risks is that it forces a situation where we are dealing with enormous pools of block data in order to do anything and the ability to copy, image, move, backup and so forth is heavily curtailed by the fact that we are forced to work at the array level before ZFS merges the RAID, LVM and filesystem layers together into a single monolith that, if it fails, leaves you so dramatically exposed.

Yep. Just because LVM and MD are separate things, that's not necessarily a bad thing. Especially if you've got devices that can change where they are in the /dev system.

Really, it's a very important good thing. ZFS merging that all together adds so much confusion and risk exposure, it's nuts. There is a reason that no production storage ever has done that.

scottalanmiller

One big thing we've learned about ZFS risks is that it forces a situation where we are dealing with enormous pools of block data in order to do anything and the ability to copy, image, move, backup and so forth is heavily curtailed by the fact that we are forced to work at the array level before ZFS merges the RAID, LVM and filesystem layers together into a single monolith that, if it fails, leaves you so dramatically exposed.

scottalanmiller

Current status... getting additional drives mounted so that we can take block level images of these devices so that we can more safely experiment.

scottalanmiller

Some useful nuggets of info from this link..

https://serverfault.com/questions/656073/zfs-pool-reports-a-missing-device-but-it-is-not-missing

This bit from "Jim" in 2020 is super useful for background...

I know this is a five year-old question, and your immediate problem was solved. But this is one of the few specific search results that come up in a web search about missing ZFS devices (at least the keywords I used), and it might help others to know this:

This specific problem of devices going "missing", is a known problem with ZFS on Linux. (Specifically on Linux.) The problem, I believe, is two-fold, and although the ZOL team could themselves fix it (probably with a lot of work), it's not entirely a ZOL problem:

While no OS has a perfectly stable way of referring to devices, for this specific use case, Linux is a little worse than, say, Illumos, BSD, or Solaris. Sure, we have device IDs, GUIDs, and even better--the newer 'WWN' standard. But the problem is, some storage controllers--notably some USB (v3 and 4) controllers, eSATA, and others, as well as many types of consumer-grade external enclosures--either can't always see those, or worse, don't pass them through to the OS. Merely plugging a cable into the "wrong" port of an external enclosure can trigger this problem in ZFS, and there's no getting around it.
ZOL for some reason can't pick up that the disks do actually exist and are visible to the OS, just not at any of the previous locations ZFS knew before (e.g. /dev, /dev/disk/by-id, by-path, by-guid, etc.) Or the one specific previous location, more to the point. Even if you do a proper zpool export before moving anything around. This is particularly frustrating about ZOL or ZFS in particular. (I remember this problem even on Solaris, but granted that was a significantly older version of ZFS that would lose the entire pool if the ZIL went missing...which I lost everything once to [but had backups].)

The obvious workaround is to not use consumer-grade hardware with ZFS, especially consumer-grade external enclosures that use some consumer-level protocol like USB, Firewire, eSATA, etc. (External SAS should be fine.)

That specifically--consumer grade external enclosures--has caused me unending headaches. While I did occasionally have this specific problem with slightly more "enterprise"-grade LSI SAS controllers and rackmount chassis with a 5x4 bay, moving to a more portable solution with three external bays pretty much unleashed hell. Thankfully my array is a stripe of three-way mirrors, because at one point it literally lost track of 8 drives (out of 12 total), and the only solution was to resilver them. (Which was mostly reads at GBs/s so at least it didn't take days or weeks.)

So I don't know what the long-term solution is. I wouldn't blame the volunteers working on this mountain of code, if they felt that covering all the edge cases of consumer-grade hardware, for Linux specifically, was out of scope.

But I think that if ZFS did a more exhaustive search of metadata that ZFS manages itself on each disk, would fix many related problems. (Btrfs, for example, doesn't suffer from this problem at all. I can move stuff around willy-nilly completely at random, and it has never once complained. Granted, Btrfs has other shortcomings compared to ZFS (the list of pros and cons is endless), and it's also native Linux--but it at least goes to show that the problem can, in theory, be solved, at least on Linux, specifically by the software itself.

I've cobbled together a workaround to this problem, and I've now implemented on all my ZFS arrays, even at work, even on enterprise hardware:

Turn the external enclosures off, so that ZFS doesn't automatically import the pool. (It is frustrating that there still seems to be no way to tell ZFS not to do this. Renaming the cachefile or setting it to "none" doesn't work. Even without the addressing problems, I almost never want the pools to auto-mount but would rather an automatic script do it.)
Once the system is up and settled down, then turn on the external enclosures.
Run a script that exports and imports the pool a few times in a row (frustratingly sometimes necessary for it to see even legit minor changes). The most important thing here, is to import in read-only mode to avoid an automatic resilver kicking off.
The script then shows the user the output of zpool status of the read-only pool, and prompt the user if it's OK to go ahead and import in full read-write mode.

Doing this has saved me (or my data) countless times. Usually it means I have to move drives and/or usually just cables around, until the addressing gets back to where it was. It also provides me with the opportunity to try different addressing methods with the -d switch. Some combination of that, and changing cables/locations, has solved the problem a few times.

In my particular case, mounting with -d /dev/disk/by-path is usually the optimal choice. Because my former favorite, -d /dev/disk/by-id is actually fairly unreliable with my current setup. Usually a whole bay of drives are simply missing entirely from the /dev/disk/by-id directory. (And in this case it's hard to blame even Linux. It's just a wonky setup that further aggravates the existing shortcomings previously noted.)

Sure, it means the server can't be relied upon to come up automatically without manual intervention. But considering 1) it runs full-time on a big battery backup, 2) I've knowingly made that tradeoff for the benefit of being able to use consumer-grade hardware that doesn't require two people and a dolly to move... that's an OK tradeoff.

scottalanmiller

We have a ZFS pool from a ProxMox server that died (not an install that we did.) There is no backup (not an environment we set up.) The drives didn't fail, they are clean and healthy. We moved the drives to another host and they show up fine. Everything registers fine. We do an import and we can see the pool to import but when we import it we get "one or more devices is currently unavailable", even though it clearly shows that they are available.

There used to be more pools showing in this as well. Others have disappeared over time. Originally these all imported with only minor problems. But they've stopped importing. The device names have changed over time, too. But they are correct.

root@pve1:/usr/local/mesh_services/meshagent# zpool import
   pool: rpool-pmx3
     id: 9234020319468906434
  state: ONLINE
status: The pool was last accessed by another system.
 action: The pool can be imported using its name or numeric identifier and
        the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-EY
 config:

        rpool-pmx3  ONLINE
          mirror-0  ONLINE
            sdb3    ONLINE
            sdc3    ONLINE
          mirror-1  ONLINE
            sdd     ONLINE
            sde     ONLINE
root@pve1:/usr/local/mesh_services/meshagent# zpool import -f rpool-pmx3
cannot import 'rpool-pmx3': one or more devices is currently unavailable

scottalanmiller

@JasGot said in Sharepoint - Hybrid / Mixed solution?:

I'm open to even the wildest ideas.

Is it only MS solutions for documents that they want, or are other options on the table too? Assuming so, but would be foolish not to check.

scottalanmiller

Working on a ZFS disaster.

scottalanmiller

@coliver said in Not much luck with Linux Distro's:

@JaredBusch https://nobaraproject.org/. It's from the main developer behind Proton. I've been running it since it came out.

You mean... Steam?

scottalanmiller

Something that I just had to say to someone in Montana dealing with this problem...

They assumed no one was reporting the issue, since the state hadn't fixed it. But in reality, normal reports of the site being down wouldn't inform them of much. ANd when the hosting people check and the site is up, likely they'd ignore the reports.

The problem with things like geo-IT blocking is that anyone that would use that as a tool thinking it had some value, would naturally have little chance of being able to understand when or why it wouldn't work. If they had the ability to troubleshoot it, they've have the knowledge that would have told them it was never okay to use in the first place.

scottalanmiller

Awesome example happening right now.

https://montanastatefund.com/

For no good reason, this is geo-blocked. And in the worst way, without stating the issue but presenting the site as being offline. Works from some places in Montana. Works from some places in California. Blocked in Nicaragua or Bolivia where we casually tested.

Now before someone makes an insane excuse that there is no reason for those places to use that site, keep in mind that the Montana companies IT team members are in those locations being asked to deal with an issue that involves that site not working. And keep in mind that people from Montana are, presumably, allowed to travel. So any suggestion that there is never a need to see a state government site outside of that state, or the US, is ridiculous and hopefully no one would ever suggest such a thing. Obviously government resources are some of the most important things to be available to US citizens and US businesses when using IPs that aren't listed as being in the US.

So what is the actual problem? In blocking "other countries", that state accidentally blocked some ISPs in Montana, too. We know this because we have sites in Montana with dual ISPs. And on one ISP it just works, on another, it is blocked. Both are Montana IPs. But people on the one ISP don't get told that the resources is blocked, they don't get told what to do, they are simply shown that the resources is offline. That's a huge problem as normal people wouldn't even know to work around a broken geo IP block. Especially when they are in the same state.

The risks to geo IP blocking are big. The benefits.. are simply lies. There are none.

This is a great example where the technical reasons often listed for why you might want to geo-IP block can easily be shown to actually be reasons why you can't.

scottalanmiller

@Obsolesce said in Is it racist? I think it is.:

@scottalanmiller said in Is it racist? I think it is.:

There's never a technical reason. We've been discussing this for years. It's common IT knowledge that there is no technical reason to geo-IP block as it doesn't do what the name implies.

It is almost always a technical reason, if ever a racially motivated one. Technical as in one or more of the reasons (not an exclusive list either) I listed in my first post.

That's obviously false as there IS no technical reason to do so. Never once have I ever heard any plausible technical reason ever suggested. But tons of "just bad business" and sometimes illegal issues with blocking. Your list of potential reasons contained zero actual viable options. None of those were true or would meet the requirements. Saying "it's almost always technical" when no known technical reason even exists, is quite the stretch. Especially when, when confronted, zero examples of "it's technical" and always "we don't want to do business with 'those people'" have been given in real life.

What is your basis for this statement? How could it possible be plausible? Your first post IS the perfect example. You couldn't come up with a single real world possible reason. We pointed out that none of those apply to any actual scenario that anyone could think of.

scottalanmiller

Man, I wish I was there