
Tuesday, May 21, 2019

How to Boot into Single User Mode in CentOS/RHEL 7

DISCLAIMER: This is not my post, only a copy in case the original gets deleted or whatever; posting it on my personal blog makes it easier for me to find. You can find the original one at this link: https://vpsie.com/knowledge-base/boot-single-user-mode-centos-rhel-vpsie/

The first thing to do is to open a terminal and log in to your CentOS 7 server.

After restarting your server, wait for the GRUB boot menu to show.

The next step is to select your kernel version and press the e key to edit the first boot option. Find the kernel line (it starts with "linux16"), then change ro to rw init=/sysroot/bin/sh.
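For illustration, the edited kernel line might look something like the lines below (the kernel version here is a placeholder from a typical CentOS 7 entry, and root=UUID=... stands for whatever your entry actually shows, so the exact values will differ):

Before:

linux16 /vmlinuz-3.10.0-957.el7.x86_64 root=UUID=... ro crashkernel=auto rhgb quiet

After:

linux16 /vmlinuz-3.10.0-957.el7.x86_64 root=UUID=... rw init=/sysroot/bin/sh crashkernel=auto rhgb quiet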

When you have finished, press Ctrl-X or F10 to boot into single-user mode.

Once the system boots, switch into the mounted root filesystem using the following command:

chroot /sysroot/
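For example, if the reason for booting into single-user mode was to reset a lost root password, the typical commands inside the chroot would be something like this (a sketch that goes beyond the original post; the touch /.autorelabel step matters only when SELinux is enforcing, so the filesystem gets relabeled on the next boot):

passwd root
touch /.autorelabel
exit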

Now, to finish the process, reboot your server using the following command:

reboot -f

Wednesday, April 17, 2019

XFS online resize

If you're working on an XFS filesystem, you need to use xfs_growfs instead of resize2fs. Two commands are needed to perform this task:

# growpart /dev/sda 1

growpart is used to expand the sda1 partition to take up the whole sda disk.
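Note that growpart is not always installed by default; on CentOS/RHEL it ships in the cloud-utils-growpart package (that package name is an assumption for EL systems, it may differ on other distributions):

# yum install cloud-utils-growpart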

# xfs_growfs -d /dev/sda1

xfs_growfs is used to grow the filesystem to fill the resized partition and apply the changes (if your xfsprogs version rejects the device path, pass the mount point instead). Finally, verify the new size:

# df -h

Friday, April 5, 2019

Convert string <-> int64 using golang #go-nuts

I believe that if you are going to work with timestamps, it is better to work in epoch timestamps; in Go, an epoch timestamp is of type int64.

package main

import (
    "fmt"
    "strconv"
    "time"
)

func main() {
    // Current time in nanoseconds since the Unix epoch (already an int64).
    now := time.Now()
    nanos := now.UnixNano()

    // int64 -> string
    bufferTimestamp := strconv.FormatInt(nanos, 10)
    fmt.Printf("bufferTimestamp value: %s of type %T\n", bufferTimestamp, bufferTimestamp)

    // string -> int64 (base 10, 64-bit result)
    timestamp, err := strconv.ParseInt(bufferTimestamp, 10, 64)
    if err != nil {
        panic(err)
    }
    fmt.Printf("Converted value: %d of type %T\n", timestamp, timestamp)
}

Running this, you will get output like the following:

$ go run test/convert_stringtoint64.go 
bufferTimestamp value: 1556951794912716618 of type string
Converted value: 1556951794912716618 of type int64

Wednesday, December 12, 2018

How to disable Cloud-Init in an EL-like Cloud Image

So this one is pretty simple. However, I found a lot of misinformation along the way, so I figured I would jot down the proper (and simplest) process here.

Symptoms: an RHEL (or variant) VM that takes a very long time to boot. On the VM console, you can see the following output while the VM boot process is stalled and waiting for a timeout. Note that the message below has nothing to do with cloud-init, but it's the output that I have most often seen on the console while waiting for a VM to boot.

[106.325574] random: crng init done

Note that I have run into this issue in both OpenStack (when booting from external provider networks) and in KVM.

Before the VM's initial boot, run the command below on your host to make sure the needed tools are installed.

13:18:01 alvaro@lykan /home/alvaro/Documents/2post
$ sudo dnf install libguestfs libguestfs-tools openssl
Last metadata expiration check: 1:53:31 ago on Mon 16 Jul 2018 01:51:05 PM CDT.
Package libguestfs-1:1.38.2-1.fc27.x86_64 is already installed, skipping.
Package libguestfs-tools-1:1.38.2-1.fc27.noarch is already installed, skipping.
Package openssl-1:1.1.0h-3.fc27.x86_64 is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!

13:18:26 alvaro@lykan /home/alvaro/Documents/2post
$ guestfish --rw -a ../../Downloads/CentOS-7-x86_64-GenericCloud-1805.qcow2
Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
‘man’ to read the manual
‘quit’ to quit the shell

><fs> run
100% ⟦▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒⟧ 00:00
><fs> list-filesystems
/dev/sda1: xfs
><fs> mount /dev/sda1 /
><fs> touch /etc/cloud/cloud-init.disabled
><fs> quit

Seriously, that’s it. No need to disable or remove cloud-init services.
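As a side note, virt-customize (which also ships with libguestfs-tools) can do the same thing in a single command; a sketch, assuming the same image path as above:

$ virt-customize -a ../../Downloads/CentOS-7-x86_64-GenericCloud-1805.qcow2 --touch /etc/cloud/cloud-init.disabled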

Monday, July 16, 2018

Change password to users on qcow2 disk or images

Sometimes you need to change the password of a user in a qcow2 image, to test locally or because you are using an infrastructure without cloud-init. Regardless of the user, the procedure is the same.

Depending on the system, the package names could change a little. I'm using Fedora 27, where I have installed:

[alvaro@lykan 2post]$ sudo dnf install libguestfs libguestfs-tools openssl
Last metadata expiration check: 1:53:31 ago on Mon 16 Jul 2018 01:51:05 PM CDT.
Package libguestfs-1:1.38.2-1.fc27.x86_64 is already installed, skipping.
Package libguestfs-tools-1:1.38.2-1.fc27.noarch is already installed, skipping.
Package openssl-1:1.1.0h-3.fc27.x86_64 is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!


Obviously, I have a QEMU environment to test and run the images; that is a very important part, just to verify that your steps are actually working.

[alvaro@lykan 2post]$ guestfish --rw -a ../../Downloads/CentOS-7-x86_64-GenericCloud-1805.qcow2

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
‘man’ to read the manual
‘quit’ to quit the shell

><fs> run
100% ⟦▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒⟧ 00:00
><fs> list-filesystems
/dev/sda1: xfs
><fs> mount /dev/sda1 /
><fs> cp /etc/shadow /etc/shadow-original
><fs> vi /etc/shadow


Inside the vi editor you will see the file, and now you can change the hash of any user (do not close the editor until you have reached the last step). In any other terminal, run:

[alvaro@lykan 2post]$ openssl passwd -1 mysuperpassword
$1$GKdzYMMe$q20PpMv5i/QFbmgwOqtZy1
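Note that -1 produces an MD5 ($1$) hash, which is weak by today's standards; with OpenSSL 1.1.1 or newer (not the 1.1.0h shown in the transcript above) you can generate a SHA-512 ($6$) hash instead:

[alvaro@lykan 2post]$ openssl passwd -6 mysuperpassword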


Copy the generated hash and paste it between the first and second colons of the target user's line, deleting everything that was between them.


Before

root:!!:17687:0:99999:7:::
bin:*:17632:0:99999:7:::
daemon:*:17632:0:99999:7:::
adm:*:17632:0:99999:7:::
lp:*:17632:0:99999:7:::
sync:*:17632:0:99999:7:::
shutdown:*:17632:0:99999:7:::
halt:*:17632:0:99999:7:::
mail:*:17632:0:99999:7:::
operator:*:17632:0:99999:7:::
games:*:17632:0:99999:7:::
ftp:*:17632:0:99999:7:::
nobody:*:17632:0:99999:7:::
systemd-network:!!:17687::::::
dbus:!!:17687::::::
polkitd:!!:17687::::::
rpc:!!:17687:0:99999:7:::
rpcuser:!!:17687::::::
nfsnobody:!!:17687::::::
sshd:!!:17687::::::
postfix:!!:17687::::::
chrony:!!:17687::::::


After

root:$1$GKdzYMMe$q20PpMv5i/QFbmgwOqtZy1:17687:0:99999:7:::
bin:*:17632:0:99999:7:::
daemon:*:17632:0:99999:7:::
adm:*:17632:0:99999:7:::
lp:*:17632:0:99999:7:::
sync:*:17632:0:99999:7:::
shutdown:*:17632:0:99999:7:::
halt:*:17632:0:99999:7:::
mail:*:17632:0:99999:7:::
operator:*:17632:0:99999:7:::
games:*:17632:0:99999:7:::
ftp:*:17632:0:99999:7:::
nobody:*:17632:0:99999:7:::
systemd-network:!!:17687::::::
dbus:!!:17687::::::
polkitd:!!:17687::::::
rpc:!!:17687:0:99999:7:::
rpcuser:!!:17687::::::
nfsnobody:!!:17687::::::
sshd:!!:17687::::::
postfix:!!:17687::::::
chrony:!!:17687::::::


Save the changes, close the vi editor, and exit guestfish:

><fs> quit

[alvaro@lykan 2post]$


Now you can test the image on any cloud environment or using your local QEMU environment.
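As an aside, if all you need is to set the root password, virt-customize can do it in one step without editing /etc/shadow by hand; a sketch, assuming the same image:

[alvaro@lykan 2post]$ virt-customize -a ../../Downloads/CentOS-7-x86_64-GenericCloud-1805.qcow2 --root-password password:mysuperpassword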

Wednesday, December 6, 2017

Get total provisioned size from cinder volumes

A quick way to get the total amount of provisioned space from Cinder:

alvaro@skyline.local: ~
$ cinder list --all-tenants
This prints a MySQL-like table output :)

So, to parse the output and sum all the values in the Size column, use the following piped commands:

alvaro@skyline.local: ~
$ . admin-openrc.sh

alvaro@skyline.local: ~
$ cinder list --all-tenants | awk -F'|' '{print $6}' | sed 's/ //g' | grep -v -e '^$' | awk '{s+=$1} END {printf "%.0f", s}'
13453

The final result is in GB.
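For reference, each stage of that pipeline does the following: awk -F'|' '{print $6}' extracts the sixth |-separated field (the Size column), sed 's/ //g' strips the padding spaces, grep -v -e '^$' drops the blank lines left by the table borders, and the final awk sums the values. Here is the same pipeline broken out one stage per line, in case you want to adapt it:

cinder list --all-tenants \
  | awk -F'|' '{print $6}' \
  | sed 's/ //g' \
  | grep -v -e '^$' \
  | awk '{s+=$1} END {printf "%.0f", s}'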

Wednesday, June 14, 2017

Ceph recovery backfilling affecting production instances

In any kind of distributed system you will have to choose between consistency, availability, and partition tolerance. The CAP theorem states that in the presence of a network partition, one has to choose between consistency and availability; by default (with default configurations) Ceph provides consistency and partition tolerance. Just take into account that Ceph has many config options: ~860 in Hammer, ~1100 in Jewel (check out the config_opts.h file in the Jewel branch on GitHub).

Getting any specific behavior out of your cluster depends on your ability to configure it, and to change options on the fly in case of contingency. This post is about the default recovery/backfilling options. Maybe you have noticed that a critical failure, like losing a complete node, causes a lot of data movement and lots of ops on the drives: by default the cluster will try to recover in the fastest way possible, while it also needs to support normal operation and common use. As I said at the beginning of the post, by default Ceph gives you consistency and partition tolerance, so the common outcome is that availability starts to suffer: users will notice high latency, and high CPU usage in instances using the RBD backend, because of the slow responses.

Let's think about this in a better way and analyze the problem. If we have a replica-3 cluster and one server goes down (even if it is a 3-server cluster), operation is still possible, and the recovery jobs are not that urgent, because Ceph tries to achieve consistency all the time and will eventually get back to the correct 3-replica state. So everything will be fine and there is no data loss; the remaining replicas will regenerate the missing replica on other nodes. The big problem is that the backfilling will compromise the operation. So the real question is whether we want a quick recovery or normal responses to the connected clients and watchers, and the answer is not that hard: operational response is priority number 0!

Lost and recovery action in CRUSH (Image from Samuel Just, Vault 2015)

This is not the ne plus ultra solution, just my solution to this problem; all of this was tested on a Ceph Hammer cluster:

1.- The best option is to configure this from the beginning of the installation, in the ceph.conf file:

******* SNIP *******
[osd]
....
osd max backfills = 1
osd recovery threads = 1
osd recovery op priority = 1
osd client op priority = 63
osd recovery max active = 1
osd snap trim sleep = 0.1
....
******* SNIP *******

2.- If not, you can inject the options on the fly. You can use osd.x, where x is the number of the OSD daemon, or apply them cluster-wide as in the next example; but remember to also put them in the config file afterwards, because injected options are lost on reboot.

ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-max-backfills 1'
ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-recovery-threads 1'
ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-client-op-priority 63'
ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-recovery-max-active 1'
ceph@stor01:~$ sudo ceph tell osd.* injectargs '--osd-snap-trim-sleep 0.1'
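To confirm the injected values took effect, you can query an OSD's admin socket on the node where it runs (using osd.0 on stor01 here is just an assumption for illustration):

ceph@stor01:~$ sudo ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_op_priority'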

The final result will be a really slow recovery of the cluster, but normal operation without any kind of problem.