
Re: Background fsck



Chris Pressey wrote:

On Wed, 21 Jan 2004 15:20:10 -0500
Gary Thorpe <gathorpe79@yahoo.com> wrote:


There was a thread on tech-kern@netbsd.org regarding relocation of bad
sectors and caching, and the informal observations were that some IDE
drives LIE about this when the cache is enabled. That's such a
wonderful improvement, don't you think?



Right up there with fake-parity memory, I'd have to say.



In fact, without hardware being able to reliably and truthfully inform
the OS of what is happening, no filesystem can guarantee anything.


Which is kind of what I was trying to get at. If drive manufacturers want to go this route and be taken seriously, they're going to have to start adding stuff that's normally in the filesystem to the drive itself.

(More likely they don't fear not being taken seriously, I grant you.)


But ATA gets high sequential transfers, so I guess that's all that
matters.


Not to me or you, obviously, but most people apparently have a high tolerance for crap.


I have never used a PC without an ATA disk: they are cheap, ubiquitous and reliable enough for home use. However, even a cheap disk should operate properly and not lie about what it is doing. Even if it told the OS that cache flushing was ineffective for this particular drive, that would be better than pretending it actually did something. That just seems like an incomplete product. Not all IDE disks are that crappy, so it seems like just sloppy work from the company.
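
To make the dependency concrete, here is a minimal sketch of what an application can do to request durability. The function name `durable_write` is made up for illustration. The point is only that a clean return means no layer reported an error; whether the drive really committed the data to the platter is outside the program's control.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload and ask the OS to push it to stable storage.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()              # drain Python's userspace buffer to the OS
        os.fsync(f.fileno())   # ask the OS to flush its own cache for this file
    # Getting here means no layer reported a failure. A drive that
    # acknowledges a flush it never performed looks identical from here.
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(durable_write(os.path.join(d, "journal.bin"), b"x" * 512))
```

Everything a filesystem promises about ordering sits on top of that same request, so a drive that answers dishonestly undermines all of it.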

Of course, I have never heard of modern SCSI disks doing any of this regardless.



Existing designs may be MORE complex and harder to maintain than
softdeps: do you want that?


Hell no. That's like the last thing I want.


But if someone else wants to use (say) EXT3 or ReiserFS with DragonFly,
I wouldn't want to stop them.  Especially considering these are already
maintained by other parties.


I think ext3 would be easiest (ext2 is already available across the BSDs), but not technically the best. I think ReiserFS wants to grow into a database or some unified name space, so... What I would be interested to know is whether implementing some of them under a BSD license might peeve the original owners.



If it is necessary, design one from scratch using principles the
others explored/built on. It's not impossible.


No, but it's extra work, and it isn't necessary.



Since the VFS is NOT the major obstacle to supporting journaling


(As I said, I don't really care about journalling.)



(almost all discussions I have seen end with "let's get LFS working
right instead", which implies that the people who will actually decide
want to keep it BSD-based) and the VFS systems in all the BSDs
already support multiple file systems, I don't see where you are going.
Do you mean to make them more modular/flexible, to allow module
loading/unloading and dynamic addition of filesystems?


I mean, make it easier to port other filesystems to it.


As it stands, the VFS does support other filesystems, but poorly.  I'm
not sure how much of this has to do with UFS being regarded, in large
part, as the One True Filesystem for FreeBSD, and how much of it is
strictly technical.


This is what I am hinting at: is the lack of file system choices technical or philosophical? If it is the latter, it won't matter how the VFS is....


Do you honestly think Linux has a good design for this or is it a hack
(I don't know, I am asking)?


I don't really know either, but my impression is that there's way less cruft in it, even if the design isn't any better.


Sure.  But journalling != atomicity, and I don't care nearly as much
about the former as I do the latter.

Yes, journaling is atomicity:


No. Journalling implies atomicity, but atomicity doesn't imply journalling.


either a change makes it into the log or
it doesn't, or at least that's the impression I get of HOW they
_should_ be designed (with journaling commits being atomic).
The other alternative is to try and get EVERY file system operation to
be atomic, which will probably be infeasible or completely destroy
performance. Disks can only guarantee that small blocks are
read/written atomically, so could you please elaborate on how this
would work?
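
For reference, the "either it makes it into the log or it doesn't" idea can be sketched as a toy write-ahead log. This is an illustration under stated assumptions, not how any particular filesystem does it: `Journal`, `recover`, `checksum` and the `crash_before_commit` flag are invented names, the log is an in-memory list standing in for disk, and the commit record is assumed to fit in one block that the disk writes atomically.

```python
import zlib


def checksum(blocks):
    """CRC32 over block numbers and payloads of one transaction."""
    crc = 0
    for blkno in sorted(blocks):
        crc = zlib.crc32(str(blkno).encode(), crc)
        crc = zlib.crc32(blocks[blkno], crc)
    return crc


class Journal:
    """Toy append-only log (hypothetical, in memory)."""

    def __init__(self):
        self.log = []

    def write_transaction(self, txid, blocks, crash_before_commit=False):
        # Step 1: log every modified block. None of this is visible yet.
        for blkno, data in blocks.items():
            self.log.append(("data", txid, blkno, data))
        if crash_before_commit:
            return  # simulate power loss before the commit record lands
        # Step 2: one small commit record. Its arrival is the atomic step.
        self.log.append(("commit", txid, checksum(blocks)))


def recover(journal, disk):
    """Replay only transactions whose commit record is present and valid."""
    pending = {}
    for rec in journal.log:
        if rec[0] == "data":
            _, txid, blkno, data = rec
            pending.setdefault(txid, {})[blkno] = data
        else:
            _, txid, crc = rec
            blocks = pending.pop(txid, {})
            if checksum(blocks) == crc:
                disk.update(blocks)
    return disk
```

A multi-block update becomes all-or-nothing because recovery ignores anything without a matching commit record, so only that one small write has to be atomic at the disk level.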


I have no idea. But I have no reason to believe journalling is the only feasible option. And as I said, I don't really care.


Suppose the driver has a bug which causes the kernel to use an
invalid pointer: since most OSes are still monolithic, you are more
unsure about what you may have just corrupted (including FS code).

Or suppose the kernel just refuses to use the invalid pointer.


Or suppose it IS valid, but it points to the wrong data and you overwrite something, and it is only caught later? What error handling can you do? The error is asynchronous: it will either go undetected at first and be revealed later, OR it will cause a trap. Unless you want to add exception handling to a kernel, there is not much else you can do if the error occurs in the same module as the core kernel (as in a monolithic kernel and not in a microkernel, although faults within a microkernel itself, rather than in one of the servers, would have the same result).



So why *not* add some form of exception handling to the kernel? At least for the things on the border between the kernel and the rest of the world, like device drivers and filesystems.

The answer is usually "because we don't want to take that sort of
performance hit."

Which is fine as long as some level of performance is acknowledged as
being a higher priority than some level of reliability.


Exceptions are supposed to be exceptional :-) I don't think it matters if it takes a longer time to recover from what would otherwise be a fatal error. Exceptions are _supposed_ to be implemented in such a way that the common, normal code path sees a minimal performance hit.

Actually, wouldn't it probably be faster than doing rigorous assertions/checks on each pointer?
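
As a rough application-level analogy (not kernel code), the two styles look like this. The names `checked_read`, `unchecked_read` and `call_driver` are hypothetical, and a Python list index stands in for a pointer. The checked version pays for a test on every access; the boundary version does no explicit check on the normal path and only does extra work when a fault actually happens.

```python
import errno


def checked_read(table, index):
    """Defensive style: validate on every single access."""
    if not isinstance(index, int) or not 0 <= index < len(table):
        return errno.EFAULT, None
    return 0, table[index]


def unchecked_read(table, index):
    """'Driver' code with no explicit checks; a bad index raises IndexError."""
    return table[index]


def call_driver(fn, *args):
    """Boundary between 'kernel' and 'driver': faults become error codes."""
    try:
        return 0, fn(*args)
    except (IndexError, KeyError, TypeError):
        # Recovery may be slow, but it only runs in the exceptional case.
        return errno.EFAULT, None
```

Both `checked_read(t, i)` and `call_driver(unchecked_read, t, i)` give the caller an error code instead of a crash. Note that the boundary version only catches faults that trap; a valid pointer to the wrong data slips past both.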



I suppose it would be interesting to people working on fault tolerance/corrections in things like space exploration, but I doubt there is enough will to get it working on even commercial systems.


Heck, if there isn't even enough will to manufacture "honest" hard drives and memory for commercial systems, then there certainly isn't going to be enough will to build reliable software for them, right?


To be fair, not all IDE disks lie, and IDE is really a consumer product. The fake parity memory is a very valid example though... so I have to agree. However, languages do support the concept (Java, C++ and Ada are some). The problem is how to move the concepts from applications to system-level software. Ada probably is already doing this, but no one outside of aerospace/military seems to use it.

Why does this matter? Who needs a cheap slogan anyway?


I think you missed my point - the trite sig was only to illustrate. Cheap slogans don't matter, but philosophy does, and IMO DragonFlyBSD's
philosophy could stand to be clearer.


-Chris


I get the impression the philosophy is to actually use DragonFlyBSD to test new concepts and develop new approaches to produce a better BSD solution. Best of breed? Research platform?