Mailing List Archive

Slightly OT: Powering up remotely
Away from home for a bit and have been logging in remotely to check on my
MythTV system, which also runs Zoneminder.

Last night while on the system, I suddenly lost the connection. When my
remote internet connection came back (the ISP can see the modem on their
network) I couldn't login to the Myth box so I think that must have gone
too.

It's connected to a surge protector but currently not to a UPS. The bios is
set to autostart when the power comes on.

1) Let's say it auto-started but is currently sitting there with some
"press y to continue" drive/os message. To future proof this, aren't there
some motherboards that allow you to have remote KVM access through the bios?

2) let's say you have a UPS with shutdown software. It goes on battery.
Safely shuts down Linux box. Power comes back before the battery has
drained. I'm assuming the computer won't autostart because the power to it
never stopped. Anyone ever tried using a WiFi plug to power the computer
off to force an autostart?

Thanks for any advice.
Re: Slightly OT: Powering up remotely [ In reply to ]
On 9/21/20 2:13 PM, Ian Evans wrote:
> Away from home for a bit and have been logging in remotely to check on
> my MythTV system, which also runs Zoneminder.
>
> Last night while on the system, I suddenly lost the connection. When
> my remote internet connection came back (the ISP can see the modem on
> their network) I couldn't login to the Myth box so I think that must
> have gone too.
>
> It's connected to a surge protector but currently not to a UPS. The
> bios is set to autostart when the power comes on.
>
> 1) Let's say it auto-started but is currently sitting there with some
> "press y to continue" drive/os message. To future proof this, aren't
> there some motherboards that allow you to have remote KVM access
> through the bios?
>
> 2) let's say you have a UPS with shutdown software. It goes on
> battery. Safely shuts down Linux box. Power comes back before the
> battery has drained. I'm assuming the computer won't autostart because
> the power to it never stopped. Anyone ever tried using a WiFi plug to
> power the computer off to force an autostart?
>
> Thanks for any advice.

I just did some testing on this after my 7-year-old UPS battery
died and I replaced the whole thing.

1.)  I used to work for Intel prior to 2013, and in those days you
could buy a business motherboard with the Management Engine and a 2nd
Ethernet connector, so you could run network management remotely,
including everything from KVM to re-flashing the BIOS. At that time
corporate IT loved those features because they could do everything
remotely.  That's 7-year-old information now, so your mileage may vary.

2.)  I have my UPS software set to let the system run until a critical
low-battery alert is received and a shutdown is issued. I also have the
BIOS set to always power on if power is restored after a failure.  So
most of the time the sequence is: the power is lost, and either it
recovers quickly (5-10 minutes) and nothing happens except some warning
emails, or the battery gets critically low, the UPS shuts down the
computer, the other peripherals keep draining the battery, and then the
UPS is completely dead.  When power is restored, the BIOS registers
this as a previous power failure and immediately boots.
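
(For reference, a minimal sketch of how those thresholds might look in
apcupsd's /etc/apcupsd/apcupsd.conf. The poster doesn't say which UPS
daemon he uses, so treat the choice of apcupsd and the values below as
assumptions:)

# /etc/apcupsd/apcupsd.conf (excerpt) - hypothetical values
TIMEOUT 0          # never shut down just because we are on battery
BATTERYLEVEL 5     # shut down when the charge drops to 5%
MINUTES 3          # ...or when estimated runtime drops to 3 minutes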

There is a small window of time after the PC has shut down but while
there is still power coming from the UPS: if the power to the house is
restored during that window, the PC will remain off because it never
saw the power failure.

My Netgear wireless AP/router has OpenVPN built in and I can easily
control my MythTV backend remotely, except when I hit the problem above,
where the UPS had shut down the PC and the power was restored before
the battery had drained completely.

You could solve this with a corporate motherboard, or, as I do, I give my
neighbor a bottle of wine every time he goes into my house to power up
my PC after that problem occurs.  So far that's been the most economical
solution: $30 for a decent bottle of wine versus $200 for a new
motherboard plus $200 for a management-capable CPU and chipset, etc.
I've had this happen once in 3 years, so I'm ahead of the game.

Jim A


_______________________________________________
mythtv-users mailing list
mythtv-users@mythtv.org
http://lists.mythtv.org/mailman/listinfo/mythtv-users
http://wiki.mythtv.org/Mailing_List_etiquette
MythTV Forums: https://forum.mythtv.org
Re: Slightly OT: Powering up remotely [ In reply to ]
On Mon, Sep 21, 2020 at 6:13 PM Ian Evans <dheianevans@gmail.com> wrote:

> 1) Let's say it auto-started but is currently sitting there with some "press y to continue" drive/os message. To future proof this, aren't there some motherboards that allow you to have remote KVM access through the bios?

Yes, although most are targeted towards enterprise
or datacenter customers, and are priced accordingly.
Some business class MBs may offer remote
management too (Intel calls it their management
engine, I forgot what AMD calls their solution).

There are also 3rd parties that sell KVM solutions
for those MBs that do not come with the support.
As most of these KVMs are targeted towards enterprise
and datacenter servers, it is not uncommon that
they support the VGA ports commonly found on the
embedded GPUs of that class of server.

You can also use remote serial console support
through a serial console concentrator, but
(a) you have to make sure to configure your
kernel correctly, and (b) you have to have your
own serial console concentrator (again, common
in enterprises/data centers, not so much at
home, although there are many solutions
available).

> 2) let's say you have a UPS with shutdown software. It goes on battery. Safely shuts down Linux box. Power comes back before the battery has drained. I'm assuming the computer won't autostart because the power to it never stopped. Anyone ever tried using a WiFi plug to power the computer off to force an autostart?

Enterprise/DataCenter UPS units (and/or
power distribution units) allow you to
power cycle individual ports via network
control.

That all said, in this case, one should look
at the root cause. It is possible none of
the remote control solutions would have
mattered/helped, or been worth the
expense to add just to be able to solve
a once-in-every-five-years event.
Re: Slightly OT: Powering up remotely [ In reply to ]
> On 22 Sep 2020, at 2:13 am, Ian Evans <dheianevans@gmail.com> wrote:
>
> 1) Let's say it auto-started but is currently sitting there with some "press y to continue" drive/os message. To future proof this, aren't there some motherboards that allow you to have remote KVM access through the bios?

There are grub kernel settings
fsck=force fsck=repair

The logic being that either the disk is trashed and saying y or n during fsck won't help you, or saying y will fix your disk. In any event (no spinning rust) I do not find fsck obtrusive. I'm using ext4.
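
(As a later reply in this thread notes, the kernel command-line options
systemd actually recognises are fsck.mode and fsck.repair. A minimal
sketch of making them persistent on a Debian/Ubuntu-style GRUB setup;
the exact values are an assumption, adjust to taste:)

# /etc/default/grub (excerpt) - force an fsck at every boot and let it
# repair automatically instead of stopping at a prompt:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash fsck.mode=force fsck.repair=yes"

# then regenerate the grub config:
sudo update-grub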

James

Re: Slightly OT: Powering up remotely [ In reply to ]
On Mon, Sep 21, 2020 at 7:02 PM jam@tigger.ws <jam@tigger.ws> wrote:

>
>
> > On 22 Sep 2020, at 2:13 am, Ian Evans <dheianevans@gmail.com> wrote:
> >
> > 1) Let's say it auto-started but is currently sitting there with some
> "press y to continue" drive/os message. To future proof this, aren't there
> some motherboards that allow you to have remote KVM access through the bios?
>
> There are grub kernel settings
> fsck=force fsck=repair
>
> The logic being either the disk is trashed and saying y or n during fsck
> wont help you, or saying y will fix your disk. In any event (no spinning
> rust) I do noy find fsck obtrusive. Im using EXT4
>
>
Thanks to everyone for the input. It'll be a while until I'm back at my
system. It's funny how you don't realize how smoothly things were running
until they're not.

Now I'm curious if anyone has ever run a low-powered backup Myth box that
kicks in if the main one fails, to keep the family off your back. :-)
"Why didn't my show record!!!" :-P
Re: Slightly OT: Powering up remotely [ In reply to ]
On Tue, Sep 22, 2020 at 12:15:19PM -0400, Ian Evans wrote:
>
>
> On Mon, Sep 21, 2020 at 7:02 PM jam@tigger.ws <jam@tigger.ws> wrote:
>
>
>
> > On 22 Sep 2020, at 2:13 am, Ian Evans <dheianevans@gmail.com> wrote:
> >
> > 1) Let's say it auto-started but is currently sitting there
> > with some "press y to continue" drive/os message. To future
> > proof this, aren't there some motherboards that allow you
> > to have remote KVM access through the bios?
>
> There are grub kernel settings
> fsck=force fsck=repair
>
> The logic being either the disk is trashed and saying y or n
> during fsck wont help you, or saying y will fix your disk. In
> any event (no spinning rust) I do noy find fsck obtrusive. Im
> using EXT4
>

I'm curious to see what this turns out to be when you get back to
investigate. In my experience it could be something very simple. In my
recent experience it wasn't, but that's another story. The sequence of
events needed to restore power to a system like this can be more
complicated than it would seem. Failures in that process can leave a
remote machine in a safe, completely-off state.

Three things might help you here:

- IPMI
- A remote switchable AC power unit
- A Serial Console on your MythTV and a "terminal server"

If you had either IPMI or a serial console and a remote switchable
plug for your MythTV server, you would have some options. IPMI is what
you are talking about. It lets you get into your server before the
operating system boots up. A remote switchable power plug would let
you turn the server off and on from a remote location. Assuming the
BIOS is set up to automatically turn the power on after a power cut,
that should boot the server. A serial console, and a terminal server
from which to access it, would allow you to control the server after
the operating system boots, even if it gets stuck before being able to
go multi-user. If your server has a serial console which
you can access remotely, you can use the serial console to answer the
fsck questions, possibly fix the filesystem, and then transition
back to running.
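
(With a board that has a BMC, remote control via the stock ipmitool
client looks roughly like this; the address and credentials here are
made up:)

# Query and cycle power on a remote BMC:
ipmitool -I lanplus -H 192.168.1.50 -U admin -P secret power status
ipmitool -I lanplus -H 192.168.1.50 -U admin -P secret power cycle

# Attach to the serial-over-LAN console to watch boot/fsck prompts:
ipmitool -I lanplus -H 192.168.1.50 -U admin -P secret sol activate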

Without going into great detail, there are plenty of remote switchable
AC plugs that you can find. So long as you can figure out a way to
remotely access it from the internet, that would help. Best here would
be something like an APC AP7900 Switchable PDU. You can get those on
eBay for between US $50 and $120. I'm assuming that you are in the US
here. I don't know what they would go for on the other side of the
pond.

As for serial consoles, they should be possible in Linux. My rough
guess as to cost is again about $100. This sounds expensive but you
need both a serial console and another server from which you access
the console. My $100 estimate:

- puts a PCIe serial card in your MythTV;

- purchases a Raspberry Pi and the other dedicated hardware that you
need to access the MythTV console.

The basic operation would then be to ssh into the Raspberry
Pi, then use a serial access program like `cu` to jump from the Pi's
ssh command line to the serial console.
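
(A rough sketch of the pieces involved, assuming the add-in card shows
up as /dev/ttyS0 on the MythTV box and a USB-serial adapter shows up as
/dev/ttyUSB0 on the Pi; device names and baud rate are assumptions:)

# On the MythTV box: send boot messages to the serial port and run a
# login getty on it. In /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="... console=tty0 console=ttyS0,115200n8"
sudo update-grub
sudo systemctl enable --now serial-getty@ttyS0.service

# On the Raspberry Pi: attach to the console over the serial cable:
sudo apt install cu
cu -l /dev/ttyUSB0 -s 115200    # type "~." to disconnect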

In summary, there's no way to know what the right solution is until
you get back home because right now you don't know enough about how
the system failed.


--
Chris

__o "All I was trying to do was get home from work."
_`\<,_ -Rosa Parks
___(*)/_(*)____.___o____..___..o...________ooO..._____________________
Christopher Sean Hilton [chris/at/vindaloo/dot/com]
Re: Slightly OT: Powering up remotely [ In reply to ]
This probably isn't going to be all that helpful, but as the others have already said - "it's complicated", and it can easily end up costing more than it's worth where "cost" can be in monetary terms or in other problems. But a few thoughts in no particular order ...

RAID 1 for your OS (/, /boot, /var, ...) etc - that's a no-brainer. I can't begin to count the number of times that's saved my backside over the years.


A key factor is your local conditions - what may work for me (all underground power supplies, very reliable mains power) may not be suitable for someone else (e.g. long overhead power supplies, "out in the sticks", unreliable mains). My last two jobs have encompassed both of those !


At my last job (IT services company) we were like I am at home - all underground power supplies and very reliable mains. For a long time we had no working UPS in the server room and I had servers with uptimes of over a year. In some ways, we actually had more problems caused by the UPSs over the years than we had due to mains failure - and that was echoed at some of our customers where UPS failure was sometimes more frequent than mains failures.
Where servers had dual power supplies (and were capable of running off one), we'd generally recommend connecting one via the UPS, and the other direct to the mains. That way, if the mains failed the UPS could keep the server going, but if the UPS failed the mains could keep the server going. But that has its own problems as the UPS doesn't see the true load until the power fails, so you get erroneous estimates of run time.

At the other end of the spectrum, at my previous job (small manufacturing company) we were on the end of overhead power lines that seemed (I once got to have a peek at the network maps) to come via every small village in the area. Power cuts were frequent, and working UPS(s) were essential - though still a tough sell to manglement who had to sign off the expenditure.

As an aside, at one point when things were particularly bad, I was able to classify power cuts into 3 main groups :
- The very short ones where power would come back after a few seconds (auto-reclose on breakers)
- Ones that were within a few minutes of 90 minutes long. The DNO engineers (a family member was one of them) were given a target of 90 minutes to restore supplies where a physical visit to a substation was needed. I suspect that they had an unwritten rule not to respond too quickly lest their manglement cut the allowed time - "you did it in 50 minutes the last few times, your target is now cut to 60 minutes"
- Ones that were "long" - where presumably there was more to restoring supplies than just switching it back on. I think the longest we had was something like 4 hours.

At that earlier job, I had looked into some automation, but in the end it just ended up too much hassle for too little reward. I had got a UPS that could run everything for 30 to 60 mins, but if I shut down "non-essentials" could keep the phone system and main server (plus remote access to it) going for a few more hours. Look into many businesses and you'll find stuff that isn't needed if the lights are out - there won't be desktop users with power so they can't use it, and back then we had few laptops. But we did have remote sites that used the main system - so keeping the core network and that going meant they could do work.

At the time, a "sales person" from APC absolutely assured me that their ShareUPS unit could do what I wanted - shut down selected servers AND REMOVE POWER*, leaving others going until the batteries were lower, and then start up anything that had been shut down. Unfortunately, what he omitted to mention was that their way of starting things back up again was to shut everything down and power cycle the UPS output - and no, it couldn't power off individual items. In effect, all it did was multiply the number of servers that could connect to the UPS for basic signalling.
They did have a product in the US, which I got hold of, that combined the signalling with power switching - but as I mentioned, you get into a situation where the risks of failure of that become higher than the risks you are mitigating, and the variety of systems (some of them Macs with no serial ports) made utilising even the basic signalling difficult.

* Our main server didn't power itself off after an OS shutdown - so little power saving.


So you end up with trade offs like :

You give yourself remote access to server management ports. Great, you've solved one problem, but added a big security issue to be managed. Given the number of systems with integrated management these days, I've read suggestions that this can be a very high security risk as they can be easily overlooked for security updates etc - that's IF the manufacturer actually provides them.

Similarly with remote power switching. You solve one problem, but add another point of failure and a security risk.


So you may be better off just arranging with a friend to be able to go in and read the screen/press buttons for you. For us, we need someone to come in and feed the pets if we're away anyway.

Simon


Re: Slightly OT: Powering up remotely [ In reply to ]
On Tue, Sep 22, 2020 at 12:17 PM Ian Evans <dheianevans@gmail.com> wrote:

>
>>
> Thanks to everyone for the input. It'll be a while until back to my
> system. It's funny how you dont realize how smooth things were running
> until they're not.
>
> Now I'm curious if anyone has ever run a low-powered backup myth that
> kicks in if the main one fails to keep the family off their backs.:-) "Why
> didn't my show record!!!" :-P
>
>
I didn't build it as a backup, but it could be. My main backend is on a PC
in a closet with a UPS and RAID 1, so it has very few problems. I built a
cheap backend using a Raspberry Pi 4 4GB and it can easily record 4 shows
at once, maybe more, but I only had an HDHR Quatro. I used a USB3-to-SATA
adapter to connect a 1TB SSD to the RPi4. You can use this as a combined
backend/frontend, but I'm not satisfied with mythfrontend on it for my
shows.

You can easily set up the RPI4 to record all the critical shows from a
networked tuner like HDHomerun Quatro and just delete them once they have
been viewed on the main backend.

I use mine in my RV Camper when on the road and I use a FireTV 4K as a
frontend because Mythtv leanfront on the FireTV 4K is much better than the
RPI4 used as a frontend.

Jim A
Re: Slightly OT: Powering up remotely [ In reply to ]
On Tue, Sep 22, 2020 at 10:34 PM Christopher Sean Hilton <chris@vindaloo.com>
wrote:

> [ ...snip... ]
>
> In summary, there's no way to know what the right solution is until
> you get back home because right now you don't know enough about how
> the system failed.
>
Once again to everyone: Thanks for the input, advice, and food for thought.

Christopher: Yes, the suspense is killing me. The one or two times this
happened over the last nine years, it was always some fsck/error-type
message for the recording drives (two internal, plus an external drive I
send some stuff to after recording). It had never been the OS drive. So
it's sorta frustrating that the boot process will hang (and ssh not start)
because of something happening to a data drive. You'd think it would let
you get to the point where you could ssh in and then say "Here's a problem
with your system. I can't continue past this." It'll be a while before I
see it, and if it's just a "hit y", wait a few minutes and bingo, I'll yell.

I really like the console idea as I was thinking of doing a project with a
Pi anyway and that could be one of its purposes. Question: Is it something
that you can ssh into, run the console terminal, and see the output so far,
or is it a matter of 1) ssh into the Pi, start the console, 2) use remote
AC switch to restart mythbox 3) see that NEW output? I hope that was a
clear question.

Ian
Re: Slightly OT: Powering up remotely [ In reply to ]
On Wed, Sep 23, 2020 at 8:42 PM Ian Evans <dheianevans@gmail.com> wrote:

> Christopher: Yes, the suspense is killing me.

Again, lots of possibilities. And while this may
be an unpopular statement in this venue, one
must remember that this is only TV, and if you
really need to know what happened on a show
just read the synopsis someone has made.

> The one or two times this happened over the last nine years it was always some fsck/error type message for the recording drives (two internal and an external drive I send some stuff to after recording).

I am presuming you are using a filesystem type
with robust journaling (if not, that is step zero to
correct).

Consider using the nofail mount option (especially
for external disks that may not power up quickly
or at all). You may also choose to cap the wait
time so that it does not wait the default 90 seconds,
using something like x-systemd.device-timeout=10s.
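
(A sketch of what that looks like in /etc/fstab; the
label and mount point here are hypothetical:)

# Don't block booting if this recording drive is missing
# or slow, and only wait 10 seconds for it to appear:
LABEL=recordings  /mnt/recordings  ext4  defaults,nofail,x-systemd.device-timeout=10s  0  2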

If you are the type such that you are always
going to just say "y" to the fsck prompts anyway,
understand, and then consider setting, the
various kernel command line options fsck.mode
and fsck.repair appropriately.

Consider changing the MythTV backend startup
unit to add an After and Wants for the recording
drives, if they are necessary (you may need to add
the appropriate udev stanzas to create the device
units), so that MythTV only starts after the
drives are mounted (since with nofail the boot
continues without the drives). In some cases you
may be happy to let it start up with a subset of
the drives, of course, but choose carefully.
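
(A sketch of such an override as a systemd drop-in,
assuming a hypothetical recording mount at
/mnt/recordings, i.e. the mnt-recordings.mount unit:)

# sudo systemctl edit mythtv-backend.service
# (creates /etc/systemd/system/mythtv-backend.service.d/override.conf)
[Unit]
# Only start the backend once the recording mount is up:
Wants=mnt-recordings.mount
After=mnt-recordings.mount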

> You'd think it would let you get to the point where you could ssh in

Consider changing the WantedBy for sshd to
earlier in the process (network-online.target?)
rather than multi-user.target. Of course, if
the filesystem missing is something you need
to login (like home) that might not help much.
Be aware you may need to add appropriate
Wants and Afters that are inherited by being
late in the process. And note that network-online
may not be what you think it is.
Re: Slightly OT: Powering up remotely [ In reply to ]
On Wed, 23 Sep 2020 16:41:54 -0400, you wrote:

>Christopher: Yes, the suspense is killing me. The one or two times this
>happened over the last nine years it was always some fsck/error type
>message for the recording drives (two internal and an external drive I send
>some stuff to after recording). It had never been the OS drive. So it's
>sorta frustrating that the boot process will hang (and ssh not start)
>because of something happening to a data drive. You'd think it would let
>you get to the point where you could ssh in and then say "Here's a problem
>with your system. I can't continue past this." It'll be a while before I
>see it and if it's just a "hit y" wait a few minutes and bingo, I'll yell.

The problem where the system does not boot when fsck happens used to
be common on older versions of Ubuntu using systemd. For a while now
on 18.04 I have not seen it, but I have not been particularly testing
for it. There may have been a systemd update that fixes it, or makes
it less likely. What I suspect is happening is that the fscks take a
long time, which causes other things that systemd has waiting to start
to exceed the timeout after which they get started anyway, even if the
preconditions for them starting have not been met. So they start up
and then fail. In the case of ssh, networking may not be up when it
gets started, or systemd may find that it is unable to start the
normal system (multiuser mode) and sshd may only be started when
multiuser mode is started. Systemd or the configuration of these
systemd units should be fixed to prevent this, and may have been at
least somewhat.

However, if you are relying on the system to do fscks after a crash,
power failure or failed shutdown, then you are living dangerously.
There are two distinct problems with relying on automatic fscks
happening on boot when partitions are marked as dirty.

First, the automatic fsck with the fix option on will only be run
once. When manually running fsck in this situation, I frequently get
problems that are still present after telling fsck to fix everything.
My worst one so far was on my mother's MythTV box where it took 7 runs
of fsck before it ran without finding new errors to fix. So if the
system automatic fsck does fix things, that does not mean the
partition is safe to be written to. To be safe, you have to have fsck
run and not find any errors to fix. The automatic fsck checks at boot
do not do this. And when manually running fsck, there can be times
when it tells you that a file is damaged and can not be repaired - you
have to write down the file names and manually fix them later. Usually
the files are ones that are installed by packages and I can just copy
the same file from one of my other PCs running the same distro.
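
(A sketch of that "re-run until clean" idea as a small loop, in the same
style as the scripts further down. fsck exits non-zero whenever it had to
correct something, so looping until it exits 0 gives you the clean run;
the label is hypothetical, and if the errors are uncorrectable this will
loop forever, so watch it:)

#!/bin/bash
# Keep fscking the partition until a run finds nothing left to fix.
until fsck -f -y /dev/disk/by-label/rec1; do
    echo "fsck made repairs; running it again..."
done
echo "fsck finished with no errors"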

Secondly, there can be data errors caused by things having failed to be
written to disk. Fsck can have run without errors, but software
with complex use of data in its files can still have things in an
inconsistent or corrupt state. It depends on each program as to how
it maintains consistency and an error free state. The place I meet
this all the time is the mythconverg database. If mythbackend is
recording at the time of a power failure or crash, the recordedseek
table is usually left in a corrupt state, as it is written to all the
time during a recording. If you are unlucky, other database tables
may have also been being written to at the time of the crash and can
also be left in a corrupt state. If you make sure to check and fix
all the database tables before running mythbackend again, there is
normally no problem as MySQL/MariaDB will be able to fix the tables.
However, if you run mythbackend without fixing the tables first, you
can cause writes to the database that will go wrong due to the
existing corruption and this can then cause further corruption that
makes the table unfixable. With the recordedseek table, this is not
so bad as mythbackend can re-create it completely if told to do so (it
takes a long, long time if you have lots of recordings). But with
most of the other tables, you will have no option but to restore a
backup copy of the database to fix this. And if you did not notice
the problem for a while, you can find that all your existing backups
of the database are also corrupt and you can lose your entire
database. I keep both daily and weekly backups to help with this
problem.

So I recommend that MythTV users never, ever rely on the automatic
fscks done at boot to recover from a crash situation like this. The
right way to recover is to manually boot to a safe state where nothing
is running except the system in read only mode and manually fix
things. Using the recovery options on the grub menu to boot to a root
prompt might seem like the way to do this, but unfortunately that does
not allow you to run fsck on the boot partition. So there are two
ways I do it. One is to have a second boot partition on the same PC
that has the same or later version of the operating system on it. It
is set up to boot only into its system and not mount any other
partitions. So from the grub menu I can boot that partition, run apt
to update it to the same versions of packages as the main boot
partition, then use it to run fscks on all the other partitions,
including the normal boot partition. I run fsck on all partitions and
on any that need fixes I re-run it until it says there are no more
errors.

I normally write myself a set of fsck repair scripts to do that so I
can just run them to get all the partitions being fscked in parallel.
There is one script (chk_all.sh) that runs all the other scripts (one
per drive) in a tabbed window:

root@mypvr:/mnt/ssd1/usr/local/bin# cat chk_all.sh
#!/bin/bash

# Check all partitions in parallel.

xfce4-terminal -H --title=rec1 --command chk_rec1.sh \
--tab -H --title=rec2 --command "bash chk_rec2.sh" \
--tab -H --title=rec3 --command "bash chk_rec3.sh" \
--tab -H --title=rec4 --command "bash chk_rec4.sh" \
--tab -H --title=rec5 --command "bash chk_rec5.sh" \
--tab -H --title=rec6 --command "bash chk_rec6.sh" \
--tab -H --title=rec7 --command "bash chk_rec7.sh" \
--tab -H --title=stardom --command "bash chk_stardom.sh" \
--tab -H --title=ssd --command "bash chk_ssd.sh"

and each of the scripts does fsck on the partitions on one drive. For
example:

root@mypvr:/mnt/ssd1/usr/local/bin# cat chk_rec2.sh
#!/bin/bash

echo Checking rec2
fsck -C -f /dev/disk/by-label/rec2
echo Checking rec2boot
fsck -C -f /dev/disk/by-label/rec2boot
echo Finished

The chk_stardom.sh script checks all four drives I have in my Stardom
eSATA drive mount. They are all on the one eSATA connection and can
not be checked in parallel with each other:

root@mypvr:/mnt/ssd1/usr/local/bin# cat chk_stardom.sh
#!/bin/bash

echo Checking vid1
fsck -C -f /dev/disk/by-label/vid1
echo Checking vid1boot
fsck -C -f /dev/disk/by-label/vid1boot
echo Checking vid2
fsck -C -f /dev/disk/by-label/vid2
echo Checking vid3
fsck -C -f /dev/disk/by-label/vid3
echo Checking vid4
fsck -C -f /dev/disk/by-label/vid4
echo Finished

After the fsck checks are all done I manually re-run fsck on any
partitions that needed repairs, and keep doing that until fsck reports
no problems.

Then, before rebooting, I change the normal boot partition to disable
mythbackend from starting by removing the link in
/etc/systemd/system/multi-user.target.wants that points to
mythbackend:

root@mypvr:/etc/systemd/system# ll
multi-user.target.wants/mythtv-backend.service
lrwxrwxrwx 1 root root 42 Jul 2 22:26
multi-user.target.wants/mythtv-backend.service ->
/lib/systemd/system/mythtv-backend.service

so my commands to do the rm are:

mount /dev/disk/by-label/ssd2 /mnt/ssd2
rm /mnt/ssd2/etc/systemd/system/multi-user.target.wants/mythtv-backend.service

Those commands are normally in my command history so I re-use them
rather than typing them again each time.

Removing that link does exactly what running this command does when
run from the normal boot:

sudo systemctl disable mythtv-backend

so mythbackend will not be started automatically. Then I reboot into
the normal boot partition. Immediately after rebooting, you may find
that anacron is running. This happens if the reboot is done after
midnight and before the normal time that anacron gets run at every
day. If so, you want to kill anacron as it will want to run its
normal daily (and weekly or monthly) jobs, some of which are for
database backup and repair, and many of which will make the PC very
busy and slow to do your manual repairs. So to prevent that:

sudo su
pkill anacron

This has to be done immediately after rebooting - anacron waits a few
minutes after it is started before it does anything, so you need to
kill it before it starts its jobs.

It is also possible to have anacron permanently set up to only be run
on its daily timer, rather than also at boot. To do that, run this
command:

sudo systemctl disable anacron

After that, anacron will still be run from the systemd anacron.timer
unit, but not from the anacron.service unit. I recommend doing this
for MythTV boxes that are normally left running 24/7.

Then I do this:

cd /etc/cron.daily
./optimize_mythdb

I have my optimize_mythdb and mythtv-database commands in cron.daily,
rather than cron.weekly, so I get daily database checks and backups.
This is also highly recommended.

Normally optimize_mythdb will find that the recordedseek table needs
fixing and will repair it. Occasionally, other tables will also need
repairs. If I am very unlucky, optimize_mythdb will report it was
unable to repair recordedseek. Usually I can then use manual repair
commands to repair it. It is a number of years now since I have had
to restore it from backup. Then I can do this:

systemctl enable mythtv-backend
systemctl start mythtv-backend
exit

and everything is working again.
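
(The "manual repair commands" above are not spelled out; a sketch of one
common approach for the default mythconverg database - the user name is a
placeholder, use whatever credentials are in your config.xml, and note
that REPAIR TABLE applies to MyISAM tables:)

# Check and repair just the recordedseek table (prompts for the password):
mysqlcheck --repair mythconverg recordedseek -u mythtv -p

# Or the equivalent from the mysql/mariadb client:
#   REPAIR TABLE recordedseek;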

Once mythbackend is running, I may need to run "mythcommflag
--rebuild" on all the recordings that had their recordedseek entries
affected by the corruption of that table. I have a user job set up
that runs that command, so I can just see what recordings were made
since the last database backup and run mythcommflag on each of them
from mythfrontend. If you also do commercial skip flagging, that
needs to also be redone by a second mythcommflag command after the
--rebuild one.
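
(For a single recording, that rebuild looks roughly like this; the chanid
and starttime are placeholders you read off the recording in question:)

mythcommflag --rebuild --chanid 1021 --starttime 20200921181500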

After that, everything is back to normal until the next time it
happens.

The other way to do repairs like this is to boot a DVD, USB or PXE
live version of the system rather than having an extra bootable
partition. In that case, you may find you have to install packages to
get the tools you need to do the repairs. If so, you need to have
your network set up so that live boots can have Internet access. I
find that I normally will need to install the jfsutils package to get
the fsck module for my JFS partitions.
Re: Slightly OT: Powering up remotely [ In reply to ]
On Thu, Sep 24, 2020 at 2:34 AM Stephen Worthington <
stephen_agent@jsw.gen.nz> wrote:

> On Wed, 23 Sep 2020 16:41:54 -0400, you wrote:
>
> >Christopher: Yes, the suspense is killing me. The one or two times this
> >happened over the last nine years it was always some fsck/error type
> >message for the recording drives (two internal and an external drive I
> send
> >some stuff to after recording). It had never been the OS drive. So it's
> >sorta frustrating that the boot process will hang (and ssh not start)
> >because of something happening to a data drive. You'd think it would let
> >you get to the point where you could ssh in and then say "Here's a problem
> >with your system. I can't continue past this." It'll be a while before I
> >see it and if it's just a "hit y" wait a few minutes and bingo, I'll yell.
>
> The problem where the system does not boot when fsck happens used to
> be common on older versions of Ubuntu using systemd. For a while now
> on 18.04 I have not seen it, but I have not been particularly testing
> for it. There may have been a systemd update that fixes it, or makes
> it less likely. What I suspect is happening is that the fscks take a
> long time, which causes other things that systemd has waiting to start
> to exceed the timeout after which they get started anyway, even if the
> preconditions for them starting have not been met. So they start up
> and then fail. In the case of ssh, networking may not be up when it
> gets started, or systemd may find that it is unable to start the
> normal system (multiuser mode) and sshd may only be started when
> multiuser mode is started. Systemd or the configuration of these
> systemd units should be fixed to prevent this, and may have been at
> least somewhat.
> <snip>


Thanks for the wealth of suggestions. I'm beginning to wonder what's more
frustrating: the fact that the system is down, or that I can't try
everyone's suggestions right now. Stephen, your treasure trove of
suggestions is getting Evernote'd.

Thanks.
Re: Slightly OT: Powering up remotely [ In reply to ]
On Wed, Sep 23, 2020 at 04:41:54PM -0400, Ian Evans wrote:
>
>
>
>
> On Tue, Sep 22, 2020 at 10:34 PM Christopher Sean Hilton <
> chris@vindaloo.com> wrote:
>
> On Tue, Sep 22, 2020 at 12:15:19PM -0400, Ian Evans wrote:
> >
> [ ...snip... ]

> the system failed.
>
>
>
> Once again to everyone: Thanks for the input, advice, and food for
> thought.
>
> Christopher: Yes, the suspense is killing me. The one or two times this
> happened over the last nine years it was always some fsck/error type
> message for the recording drives (two internal and an external drive I
> send some stuff to after recording). It had never been the OS drive. So
> it's sorta frustrating that the boot process will hang (and ssh not
> start) because of something happening to a data drive. You'd think it
> would let you get to the point where you could ssh in and then say
> "Here's a problem with your system. I can't continue past this." It'll
> be a while before I see it and if it's just a "hit y" wait a few
> minutes and bingo, I'll yell.
>
> I really like the console idea as I was thinking of doing a project
> with a Pi anyway and that could be one of its purposes. Question: Is it
> something that you can ssh into, run the console terminal, and see the
> output so far, or is it a matter of 1) ssh into the Pi, start the
> console, 2) use remote AC switch to restart mythbox 3) see that NEW
> output? I hope that was a clear question.
>

Sorry about the long delayed reply.

The way you described it is the normal scenario. You connect to the
Pi, then connect to the serial console, and finally you bounce the power
remotely; you then see, remotely, the boot messages that would normally
appear on the console.


--
Chris

__o "All I was trying to do was get home from work."
_`\<,_ -Rosa Parks
___(*)/_(*)____.___o____..___..o...________ooO..._____________________
Christopher Sean Hilton [chris/at/vindaloo/dot/com]