Tuesday, November 11, 2014

Replacing a failed hard disk in a Linux software RAID1

Source; thanks to the author of the post.

Initial setup
 We have two hard disks, /dev/sda and /dev/sdb, from which four software RAID arrays have been built:
  • /dev/md0 - swap
  • /dev/md1 - /boot
  • /dev/md2 - /
  • /dev/md3 - /data

 To check the state of the arrays, run:
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda4[0] sdb4[1]
     1822442815 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sda3[0] sdb3[1]
     1073740664 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
     524276 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
     33553336 blocks super 1.2 [2/2] [UU]

unused devices: 

The two letters U in square brackets for each array - [UU] - indicate that the array is healthy. If an array is degraded, the corresponding U changes to _. For this example:
  • [_U] - /dev/sda has failed
  • [U_] - /dev/sdb has failed

# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda4[0] sdb4[1](F)
      1822442815 blocks super 1.2 [2/1] [U_]

md2 : active raid1 sda3[0] sdb3[1](F)
      1073740664 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sda2[0] sdb2[1](F)
      524276 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0] sdb1[1](F)
      33553336 blocks super 1.2 [2/1] [U_]

unused devices: 

The arrays are out of sync, and the failed disk /dev/sdb is to blame; we are going to replace it.
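Besides /proc/mdstat, mdadm itself can show the per-member state of an array. A quick check for md0 from this example might look like this (a sketch):
# mdadm --detail /dev/md0
The failed member is listed as "faulty" in the device table at the bottom of the output.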

Removing the failed hard disk
 Before installing the new hard disk, the failed one has to be removed from the arrays. To do that, run the following sequence of commands:
# mdadm /dev/md0 -r /dev/sdb1
# mdadm /dev/md1 -r /dev/sdb2
# mdadm /dev/md2 -r /dev/sdb3
# mdadm /dev/md3 -r /dev/sdb4

There are situations when not all of the software RAID arrays are degraded:
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[0] sdb4[1](F)
      1822442815 blocks super 1.2 [2/1] [U_]

md2 : active raid1 sda3[0] sdb3[1](F)
      1073740664 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sda2[0] sdb2[1](F)
      524276 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0] sdb1[1]
      33553336 blocks super 1.2 [2/1] [UU]

unused devices: 

In that case a still-working partition cannot be removed from its array. It first has to be marked as failed, and only then removed:
# mdadm /dev/md0 -f /dev/sdb1
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda4[0] sdb4[1](F)
      1822442815 blocks super 1.2 [2/1] [U_]

md2 : active raid1 sda3[0] sdb3[1](F)
      1073740664 blocks super 1.2 [2/1] [U_]

md1 : active raid1 sda2[0] sdb2[1](F)
      524276 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0] sdb1[1](F)
      33553336 blocks super 1.2 [2/1] [U_]

unused devices:
# mdadm /dev/md0 -r /dev/sdb1
# mdadm /dev/md1 -r /dev/sdb2
# mdadm /dev/md2 -r /dev/sdb3
# mdadm /dev/md3 -r /dev/sdb4

Preparing the new hard disk
Both disks in the array must have exactly the same partition layout. Depending on the partition table type in use (MBR or GPT), use the matching utility to copy the partition table.
For a disk with an MBR partition table, use sfdisk:
#sfdisk -d /dev/sda | sfdisk --force /dev/sdb

where /dev/sda is the source disk and /dev/sdb is the target disk.
For a disk with a GPT partition table, use sgdisk from the GPT fdisk package:
#sgdisk -R /dev/sdb /dev/sda
#sgdisk -G /dev/sdb

where /dev/sda is the source disk and /dev/sdb is the target disk. The second command assigns new random GUIDs to the new disk.
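Before adding the disk back it is worth double-checking that the partition tables really match; the layouts can be compared side by side (a sketch, using the same device names):
# sfdisk -l /dev/sda
# sfdisk -l /dev/sdb
or, for GPT disks:
# sgdisk -p /dev/sda
# sgdisk -p /dev/sdb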

Adding the new hard disk
All that remains is to add the new, freshly partitioned hard disk to the arrays and install the bootloader on it:
# mdadm /dev/md0 -a /dev/sdb1
# mdadm /dev/md1 -a /dev/sdb2
# mdadm /dev/md2 -a /dev/sdb3
# mdadm /dev/md3 -a /dev/sdb4

After that the resynchronization starts. How long it takes depends on the size of the disk:
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb4[1] sda4[0]
     1028096 blocks [2/2] [UU]
     [==========>..........]  resync =  50.0% (514048/1028096) finish=97.3min speed=65787K/sec

md2 : active raid1 sdb3[1] sda3[0]
     208768 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
     2104448 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
     208768 blocks [2/2] [UU]

unused devices: 
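To keep an eye on the rebuild without retyping the command, something like this works (a sketch):
# watch -n 10 cat /proc/mdstat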

If the system uses the GRUB2 bootloader, it is enough to run the following commands (there is no need to wait for the synchronization to finish):
#grub-install /dev/sdb
#update-grub 

Once the synchronization has finished you can breathe easy - your data is safe again.

Monday, October 27, 2014

MySQL: updating the time zone

Source; as always, thanks to the authors.

To check the current time zone, run:

SHOW VARIABLES LIKE '%zone%';
SELECT @@global.time_zone, @@session.time_zone;

To see the current time on the MySQL server:

select current_timestamp();

The time zone can be set in the configuration file as follows (a server restart is then required):

/etc/my.cnf (in the [mysqld] section)
default-time-zone = "Europe/Moscow"

The time zone can also be changed without a restart. First, load the system time zone tables into MySQL:

mysql_tzinfo_to_sql /usr/share/zoneinfo |mysql -u root mysql -p
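To confirm that the zone tables were actually loaded, a quick check might be (a sketch):

mysql -u root -p -e "SELECT COUNT(*) FROM mysql.time_zone_name;"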

After that, the time zone can be updated without running into errors like:

ERROR 1298 (HY000): Unknown or incorrect time zone:

Update time_zone:

SET GLOBAL time_zone = 'Europe/Moscow';
SET time_zone = 'Europe/Moscow';

MySQL can also use the system time zone, which is arguably even better. To change the current system time zone on the server:

cp /usr/share/zoneinfo/Europe/Moscow /etc/localtime
To make MySQL use the system time zone, run:

SET GLOBAL time_zone = 'SYSTEM';
SET time_zone = 'SYSTEM';

пятница, 24 октября 2014 г.

Overriding the default Linux kernel 20-second TCP socket connect timeout

Source, thanks to author.

Whatever language or client library you're using, you should be able to set the timeout on network socket operations, typically split into a connect timeout, read timeout, and write timeout.
However, although you should be able to make these timeouts as small as you want, the connect timeout in particular has an effective maximum value for any given kernel. Beyond this point, higher timeout values you might request will have no effect - connecting will still time out after a shorter time.
The reason TCP connects are special is that the establishment of a TCP connection has a special sequence of packets starting with a SYN packet. If no response is received to this initial SYN packet, the kernel needs to retry, which it may have to do a couple of times. All kernels I know of wait an increasing amount of time between sending SYN retries, to avoid flooding slow hosts.
All kernels put an upper limit on the number of times they will retry SYNs. On BSD-derived kernels, including Mac OS X, the standard pattern is that the second SYN is sent 6 seconds after the first, then a third SYN 18 seconds after that, and the connect times out after a total of around 75 seconds.
On Linux however, the default retry cycle ends after just 20 seconds. Linux does send SYN retries somewhat faster than BSD-derived kernels - Linux supposedly sends 5 SYNs in this 20 seconds, but this includes the original packet (the retries are after 3s, 6s, 12s, 24s).
The end result though is that if your application wants a connect timeout shorter than 20s, no problem, but if your application wants a connect timeout longer than 20s, you'll find that the default kernel configuration will effectively chop it back to 20s.
Changing this upper timeout limit is easy, though it requires you to change a system configuration parameter and so you will need to have root access to the box (or get the system administrators to agree to change it for you).
The relevant sysctl is tcp_syn_retries, which for IP v4 is net.ipv4.tcp_syn_retries.
Be conservative in choosing the value you change it to. Like BSD, the SYN retry delays increase in time (albeit doubling rather than tripling), so a relatively small increase in the number of retries leads to a large increase in the maximum connect timeout. In a perfect world, there would be no problem with having a very high timeout because applications' connect timeouts will come into play.
However, many applications do not set an explicit connect timeout, and so if you set the kernel to 10 minutes, you're probably going to find something hanging for ages sooner or later when a remote host goes down!
I recommend that you set it to a value of 6, 7, or at most 8. 6 gives an effective connect timeout ceiling of around 45 seconds, 7 gives around 90 seconds, and 8 gives around 190 seconds.
To change this in a running kernel, you can use the /proc interface:
# cat /proc/sys/net/ipv4/tcp_syn_retries 
5
# echo 6 > /proc/sys/net/ipv4/tcp_syn_retries 
Or use the sysctl command:
# sysctl net.ipv4.tcp_syn_retries
net.ipv4.tcp_syn_retries = 5
# sysctl -w net.ipv4.tcp_syn_retries=6
net.ipv4.tcp_syn_retries = 6
To make this value stick across reboots however you need to add it to /etc/sysctl.conf:
net.ipv4.tcp_syn_retries = 6
Most Linux installations support reading sysctls from files in /etc/sysctl.d, which is usually better practice as it makes it easier to administer upgrades, so I suggest you put it in a file there instead.
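For example, a sketch of doing it via /etc/sysctl.d (the file name is arbitrary; sysctl --system reloads all such files on reasonably recent distributions):
# echo "net.ipv4.tcp_syn_retries = 6" > /etc/sysctl.d/90-tcp-syn-retries.conf
# sysctl --system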
(I see no reason you'd want to reduce this sysctl, but note that values of 4 or less all seem to be treated as 4 - total timeout 9s.)

четверг, 23 октября 2014 г.

How to set time zone in VPS on node (OpenVZ).

Source, thanks to the author.
Solution:

Follow the steps below to set the time zone inside a particular container (VPS) on the node.

1) Login to the main server node via ssh.

2) Stop the container whose time you want to set.
------------------
# vzctl stop 1000 >>>>> 1000 = Container ID
------------------

3) Give the container the capability to change the system time.
------------------
# vzctl set 1000 --capability sys_time:on --save
------------------

4) Start the container and login to it.
------------------
# vzctl start 1000
# vzctl enter 1000
------------------

5) Change your local time zone as follows.
------------------
# mv /etc/localtime /etc/localtime_bk
# ln -s /usr/share/zoneinfo/America/Chicago /etc/localtime
------------------

6) Set the date and time
------------------
# date 051717302013
time has been set to 17:30 on 17 May 2013
(05 = month, 17 = day, 17 = hours, 30 = minutes, 2013 = year)
------------------
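To double-check both the zone and the newly set time from inside the container, a plain date is enough (a sketch):
------------------
# date
------------------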

Wednesday, August 13, 2014

ZoneMinder 1.27 on CentOS 6.5: the fight to get it working

1) Install CentOS 6.5, update it, disable SELinux, iptables and other unneeded services.
2) Enable the sourceforge, epel and rpmfusion repositories.
3) Increase kernel.shmmax to 256 MB (for the testing period; it may need to be even larger).
4) Install the software packages:
yum install gcc gcc-c++ wget mysql-devel mysql-server php php-mysql php-pear php-pear-DB php-mbstring bison bison-devel httpd make ncurses ncurses-devel libtermcap-devel sox newt-devel libxml2-devel libtiff-devel php-gd audiofile-devel gtk2-devel libv4l-devel ffmpeg ffmpeg-devel zlib zlib-devel openssl openssl-devel gnutls-devel php-process perl-Time-HiRes perl-CPAN pcre-devel libjpeg-devel perl-Date-Manip perl-libwww-perl perl-Module-Load perl-Net-SFTP-Foreign perl-Archive-Tar perl-Archive-Zip perl-Expect perl-MIME-Lite perl-Device-SerialPort perl-Sys-Mmap perl-MIME-tools bzip2-devel phpMyAdmin zip
4.1) Workarounds needed because ffmpeg is installed from rpmfusion:
 ln -s /usr/include/ffmpeg/libavcodec /usr/include/libavcodec
 ln -s /usr/include/ffmpeg/libavdevice /usr/include/libavdevice
 ln -s /usr/include/ffmpeg/libavfilter /usr/include/libavfilter
 ln -s /usr/include/ffmpeg/libavformat /usr/include/libavformat
 ln -s /usr/include/ffmpeg/libavutil /usr/include/libavutil
 ln -s /usr/include/ffmpeg/libpostproc /usr/include/libpostproc
 ln -s /usr/include/ffmpeg/libswresample /usr/include/libswresample
 ln -s /usr/include/ffmpeg/libswscale /usr/include/libswscale

5) Download the archive with the source code, unpack it wherever convenient, and cd into the directory with the unpacked sources:
bootstrap.sh

CXXFLAGS=-D__STDC_CONSTANT_MACROS ./configure --with-webdir=/var/www/html/zm --with-cgidir=/var/www/cgi-bin --with-webuser=apache --with-webgroup=apache ZM_DB_HOST=localhost ZM_DB_NAME=zm ZM_DB_USER=YOURZMUSER ZM_DB_PASS=YOURZMPASSWORD ZM_SSL_LIB=openssl --with-extralibs="-L/usr/lib64 -L/usr/lib64/mysql -L/usr/local/lib" --with-libarch=lib64 --with-ffmpeg --enable-mmap=yes

make
service mysqld start
mysql_secure_installation
mysql -u root -p

create database zm;
CREATE USER 'YOURZMUSER'@'localhost' IDENTIFIED BY 'YOURZMPASSWORD';
grant CREATE, INSERT, SELECT, DELETE, UPDATE on zm.* to YOURZMUSER@localhost;
FLUSH PRIVILEGES;
exit

make install

chkconfig mysqld on
chkconfig httpd on

mysql -u root -p zm < ./db/zm_create.sql

cp ./scripts/zm /etc/init.d/
chmod +x /etc/init.d/zm
chkconfig zm on

cd /var/www/html/zm
wget http://www.zoneminder.com/sites/zoneminder.com/downloads/cambozola.jar
chown apache:apache /var/www/html/zm/cambozola.jar

nano /etc/php.ini
short_open_tag = On

service httpd restart
service zm start
 6) Check that the zm web interface is reachable.
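A quick way to confirm that the web interface responds from the console is a plain HTTP request (a sketch, assuming the web directory from the configure step above):
curl -I http://localhost/zm/
An HTTP 200 (or a redirect to the console page) means Apache is serving ZoneMinder.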




Tuesday, May 27, 2014

rpc.nfsd: unable to set any sockets for nfsd

Source
 
Error: when you install and restart the NFS service, you get the error “rpc.nfsd: unable to set any sockets for nfsd”.
Root cause: the rpcbind service is not running or is not installed on the server.
Solution:
1. Check that the rpcbind service works: # rpcinfo -p
2. If it returns the error “rpcinfo: can’t contact portmapper: RPC: Remote system error – No such file or directory”,
3. then install the missing rpc package:
3a. # yum install avahi
Installing : libdaemon-0.14-1.fc13.i686
Installing : avahi-0.6.27-1.fc14.i686
4. Then restart the rpcbind service and check the status again with # rpcinfo -p.
5. Then restart nfs service.
6. Check that the NFS daemon is enabled in all run levels with # chkconfig nfs --list
7. Now restart the NFS service; this time it should start successfully.
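Put together, the whole sequence looks roughly like this (a sketch, assuming a SysV-init CentOS/Fedora box):
# service rpcbind restart
# rpcinfo -p
# service nfs restart
# chkconfig nfs --list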

IRQ affinity in Linux

Link to the official documentation for CentOS
Source of this repost

Some hardware components, such as Ethernet cards and disk controllers, generate interrupts when they need the CPU's attention - for example, when an Ethernet card receives a packet from the network. You can examine how interrupts are spread across CPUs by looking at /proc/interrupts.

# cat /proc/interrupts


This output shows which devices use which IRQ and how many interrupts each CPU has processed for each device. In normal cases you will not run into problems and there is no need to change how IRQs are assigned to CPUs. But in some cases - for example, a Linux box acting as a firewall with heavy incoming and outgoing traffic - it can become an issue. Suppose the firewall has two Ethernet cards and both are handling many packets. You may then see very high load on one of the CPUs, caused by the interrupts generated by the network cards. You can confirm this by looking at /proc/interrupts and checking whether that CPU is handling the interrupts of both cards. If so, find the most idle CPUs in the system and assign each Ethernet card's IRQ to a separate CPU. Note that you can only do this on a system with IO-APIC enabled device drivers; you can check whether your device supports IO-APIC by looking at /proc/interrupts.

For Ethernet cards there is also an implementation called NAPI, which can reduce the interrupt load caused by incoming network traffic.

You can see which IRQ is served by which CPU (or CPUs) by looking in the /proc/irq directory. The layout is simple: every IRQ in use has a directory named after its IRQ number, and every directory contains a file called smp_affinity where the CPU mask is set. The file content shows which CPUs currently serve this IRQ; the value is a hexadecimal bitmask, and the calculation is shown below. In the example, eth0 is on IRQ 25 and eth1 on IRQ 26. Say we want IRQ 25 to be served only by cpu3. First we calculate the hex value for cpu3:


Calculation (4-CPU system)
           Binary       Hex
  CPU 0    0001         1
  CPU 1    0010         2
  CPU 2    0100         4
  CPU 3    1000         8
  -----------------------
  all      1111         f

The calculation is shown for a 4-CPU system for simplicity; normally the mask for all CPUs on a system is an 8-digit hex value. As the binary column shows, every bit represents one CPU, and the mask for cpu3 alone is 8 in hex. We write that into the smp_affinity file for IRQ 25 as shown below.

# echo 8 > /proc/irq/25/smp_affinity

You can check the setting by looking in to file content.

# cat /proc/irq/25/smp_affinity
00000008

Another example: let's say we want IRQ 25 to be handled by cpu0 and cpu1.

   CPU 0    0001         1
 + CPU 1    0010         2
  -----------------------
            0011         3
Setting the bits for cpu0 and cpu1 gives the value 3. If we want all CPUs to handle a device's IRQ, we set every bit, which gives the hex value f. Linux also ships a daemon called irqbalance that distributes interrupts across all CPUs automatically, but in some cases it performs poorly and you get better results by stopping the service and assigning IRQs manually as described above. The irqbalance configuration also lets you exclude specific IRQs or CPUs from balancing, so you can tell it not to touch your manually configured IRQs and preferred CPUs while it keeps load-balancing the rest for you.
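If you keep irqbalance running, this is a sketch of excluding a manually pinned IRQ and CPU from it (assuming a CentOS-style /etc/sysconfig/irqbalance; variable names may differ between versions):
# /etc/sysconfig/irqbalance
IRQBALANCE_ARGS="--banirq=25"
IRQBALANCE_BANNED_CPUS=00000008
followed by a restart of the irqbalance service.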

Saturday, March 29, 2014

How to Check your Hard Disk for Bad Blocks in Ubuntu

Taken from here, thanks to the author.


If your system regularly runs disk checks on boot and often finds errors during those checks, it is quite possible that your hard disk has bad sectors. In that case it is highly recommended to run a disk check to find out whether the disk really has bad sectors.

What is a bad sector? A bad sector is a sector on a computer’s disk drive or flash memory that cannot be used due to permanent damage (or an OS inability to successfully access it), such as physical damage to the disk surface (or sometimes sectors being stuck in a magnetic or digital state that cannot be reversed) or failed flash memory transistors.
The simplest way to check a disk from the Linux command line is the badblocks command. To check a disk partition (in this case /dev/sdc1), use the following command:
 sudo badblocks -v /dev/sdc1
 The output will be in the following format:
sudo badblocks -v /dev/sdc1
Checking blocks 0 to 130954239
Checking for bad blocks (read-only test): 5621828 done, 3:37 elapsed
5621860 done, 8:43 elapsed
5621861 done, 13:25 elapsed
5621862 done, 17:57 elapsed
done
Pass completed, 4 bad blocks found.
If you find bad sectors, it usually means it is time to replace the disk. The situation will most likely get worse over time, though there is a small chance these are false positives (mostly caused by problems elsewhere in the system). The alternative is to mark these blocks as bad and tell the filesystem not to write any data there, which will buy the disk some more life.
Note: this second option is cheaper (though it takes a while) and an effective way to find out over time whether your disk really has errors, but if your data matters to you, back it up elsewhere or you risk losing it.
First we have to write the location of the bad sectors into a file.
 sudo badblocks /dev/sdc > /home/hacks/bad-blocks
After that, we need to feed the file into the FSCK command to mark these bad sectors as ‘unusable’ sectors.
 sudo fsck -l bad-blocks /dev/sdc
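On ext2/3/4 filesystems the same list can also be fed to e2fsck directly (a sketch, assuming the filesystem lives on /dev/sdc1):
 sudo e2fsck -l bad-blocks /dev/sdc1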

Wednesday, March 19, 2014

oVirt and its quirks.


1) Backup/Restore

1.1 Before 3.6
Back up the database with the engine-backup --mode=backup script.

1.2 An example for version 3.6:

engine-backup --mode=backup --scope=all --archive-compressor=gzip --files-compressor=None --db-compressor=None --file=<filename>.tgz --log=<filename>.txt

The result is a tgz archive containing the database dump in custom format and tarred directories with configs, certificates and other files.


1.3 Full restore procedure:
- install ovirt-engine
- run engine-setup, answering the questions the same way as during the initial installation
- stop the engine
- drop database engine / create database engine owner <your db owner> (see the sketch after this list)
- engine-backup --mode=restore --scope=all --file=<your archive file> --log=<log file> --change-db-credentials --db-host=<your host> --db-user=<your db user> --db-name=<your db name> --db-password='<you can find it in /etc/ovirt-engine/engine.conf.d/10-setup-database.conf>' --restore-permissions
- edit /etc/ovirt-engine/aaa/internal.properties
- start the engine
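A minimal sketch of the drop/create step, assuming the database and its owner are both named engine (check 10-setup-database.conf for the real names):
# su - postgres
$ psql -c "DROP DATABASE engine;"
$ psql -c "CREATE DATABASE engine OWNER engine;"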


2) Upgrade 3.4 to 3.5

2.1) Small bug when upgrading existing data centers from oVirt 3.4 to 3.5.

There is a bug filed on this that will likely be resolved in the next minor version upgrade. As a work-around, navigate to the Storage tab, click your VM’s storage domain, in the sub-tab at the bottom click Storage Profile, and add a default storage profile for your VM’s storage.
You will now be able to add disks to VMs again.

3) Installing a host server.

  3.1) Before starting the installation, attach the oVirt repository:
yum localinstall  http://resources.ovirt.org/pub/yum-repo/ovirt-release41.rpm
  3.2) When installing on a server that boots from a USB flash drive, the following adjustments are needed:
     3.2.1) mount /var/log as tmpfs
     3.2.2) configure rsyslog to ship messages to a centralized log server
     3.2.3) recreate the directory structure in /var/log on every boot; for that:

cat /usr/local/bin/create-log-dirs.sh
#!/bin/bash
/usr/bin/mkdir -p /var/log/zabbix
/usr/bin/chown zabbix:zabbix /var/log/zabbix
/usr/bin/mkdir -p /var/log/vdsm
/usr/bin/mkdir -p /var/log/libvirt
/usr/bin/mkdir -p /var/log/glusterfs
/usr/bin/touch /var/log/vdsm/upgrade.log
/usr/bin/chown -R vdsm:kvm /var/log/vdsm

cat /etc/systemd/system/create-log-dirs.service
[Unit]
After=local-fs.target

[Service]
ExecStart=/usr/local/bin/create-log-dirs.sh

[Install]
WantedBy=default.target

  systemctl daemon-reload
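The unit also has to be enabled so that it actually runs at boot (assuming the unit name used above):
  systemctl enable create-log-dirs.service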
 

4) Host USB devices inside a VM

These days it should all work from the ovirt-engine GUI; a bit about the glitches: http://lists.ovirt.org/pipermail/users/2015-December/036808.html

Just in case, here is how it was done before 3.6:

Theory:

http://www.ovirt.org/VDSM-Hooks/hostusb

Practice (taken from here):
First you have to install the hostusb hook on the host machine. Then run this command on the engine to define 'hostusb' as a custom property:

sudo engine-config -s UserDefinedVMProperties='hostusb=[\w:&]+'

You can check it with:

sudo engine-config -g UserDefinedVMProperties

Then, when editing your virtual machine, go to the 'Custom Properties' tab, select 'hostusb' and put the device id in the text box on the right, for example 0x1234:0xbeef.

You can define several ids, separating them with '&':
0x1234:0xbeef&0x2222:0xabaa

5) Memory page sharing management on hosts.
KSM: see the RH RHEV documentation.

6) Memory leak in vdsmd in oVirt 3.5

The memory leak in vdsmd is still not fixed (I have updated to 3.5.2), which the developers have essentially acknowledged; there is a long thread of complaints on the oVirt bug tracker, but nothing has moved.
For now the workaround is to restart vdsmd from cron twice a week (a sketch follows below).
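A minimal sketch of such a cron entry (the schedule and file name are arbitrary):
# /etc/cron.d/restart-vdsmd
0 4 * * 1,4 root /sbin/service vdsmd restart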

7) Binding an oVirt network interface to a fake network, for example a tun interface.

Source
I suspect the problem is that VDSM isn't told by default to pass dummy
NICs on to the engine. Have a look at the following file (exact path
might vary):

/usr/lib64/python2.7/site-packages/vdsm/config.py

Search for a variable called "fake_nics", it should exist for 3.2. By
default it is empty, but you may define a pattern for fake NIC names to
be passed to the engine (in your case probably "dummy*"). Restart the
vdsm daemon and everything should be fine. Maybe you'll have to move the
host to maintenance and then reactivate just to refresh the UI.
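A minimal sketch of what that could look like in /etc/vdsm/vdsm.conf (assuming fake_nics lives in the [vars] section, which may vary between VDSM versions):
[vars]
fake_nics = dummy*
followed by a restart of vdsmd.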

8) reset/change admin password.

ovirt-aaa-jdbc-tool user password-reset admin --password-valid-to='yyyy-MM-dd hh:mm:ssZ'

9) Remove DWH

Remove the related rpms.

# su - postgres
psql
drop database ovirt_engine_history;
drop database ovirt_engine_reports;

Remove the folders /var/lib/ovirt-engine-reports, /var/lib/ovirt-engine-dwh, /etc/ovirt-engine-reports and /etc/ovirt-engine-dwh manually. After that engine-backup works again.


10) Create a windows template

11) Errors and how to fix them.


11.1 "Host (hostname) is installed with VDSM version (4.14) and cannot join cluster (clustername) which is compatible with VDSM versions [4.13, 4.9, > 4.11, 4.12, 4.10]."
yum clean all, yum update ovirt-release, yum update
(run on the ovirt-engine host)

11.2 "Could not retrieve mirrorlist http://www.jpackage.org/mirrorlist.php?dist=generic&type=free&release=6.0 error was
14: PYCURL ERROR 22 - "The requested URL returned error: 504 Gateway Time-out"
Error: Cannot find a valid baseurl for repo: ovirt-jpackage-6.0-generic"
[root@colove01 yum.repos.d]# cat /tmp/jp.txt
# No local mirror detected - defaulting to adding them all
http://sunsite.informatik.rwth-aachen.de/ftp/pub/Linux/jpackage/6.0/generic/free

point the mirrorlist in the repo file under /etc/yum.repos.d at /tmp/jp.txt:
...
#mirrorlist=http://www.jpackage.org/mirrorlist.php?dist=generic&type=free&release=6.0
mirrorlist=file:///tmp/jp.txt
source: http://www.mail-archive.com/users@ovirt.org/msg18008.html


12) Debian ovirt-guest-agent.

apt-get install ovirt-guest-agent
chmod +x /usr/share/ovirt-guest-agent/ovirt-guest-agent.py
systemctl restart ovirt-guest-agent 
check the permissions on the /var/log/ovirt-guest-agent directory
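If the agent refuses to start because of that directory, a sketch of the fix (assuming the Debian package runs the agent as the ovirtagent user; adjust to whatever user the service actually uses):
chown -R ovirtagent:ovirtagent /var/log/ovirt-guest-agent
systemctl restart ovirt-guest-agent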

Thursday, February 13, 2014

VPLS and AToM, featuring a Cisco ASR9001 and ASR1002


   So, the task: set up an L2 tunnel between subinterfaces of the devices above. (Starting with IOS XE 3.10 on the ASR 1000, service-instance is finally allowed on Port-Channel interfaces, in which case I would recommend doing proper VPLS; 3.13.2 even fixed some odd service-instance behavior under certain conditions.)

   Obviously, L3 connectivity between the devices (and their loopback interfaces) must already be in place.

1. AToM.

 ASR9001 configuration

interface Bundle-Ether1
 description *** Uplink ***
 mtu 9000
interface Bundle-Ether1.2903
 description *** to ASR1002 ***
 ipv4 address 192.168.1.1 255.255.255.254
 encapsulation dot1q 2903
 !
 interface Bundle-Ether1.396 l2transport
  description *** test subinterface for AToM ***
  encapsulation dot1q 396
  rewrite ingress tag pop 1 symmetric <------ note this!
  mtu 1564
 !
 mpls ldp
  router-id 192.168.10.1
  discovery targeted-hello accept
  log
  neighbor
  !
  interface Bundle-Ether1.2903
 !
 l2vpn
   logging
   pseudowire
  !
  pw-class test-atom
   encapsulation mpls
    protocol ldp
    control-word
  !
  xconnect group Test-Atom_group
   p2p asr1002
    interface Bundle-Ether1.396
    neighbor ipv4 192.168.10.2 pw-id 396
    pw-class test-atom
    !
   !
  !
   ASR1002 configuration
mpls label protocol ldp
mpls ldp discovery targeted-hello accept
mpls ldp router-id Loopback0 force
!
pseudowire-class test-atom
  encapsulation mpls
  control-word
!
interface Port-channel1
  description *** Uplink ***
  mtu 1546
  no ip address
    no negotiation auto
!
interface Port-channel1.2903
  description *** to ASR9001 ***
  encapsulation dot1Q 2903
  ip address 192.168.1.2 255.255.255.254
  no ip unreachables
  ip flow ingress
  ip ospf network point-to-point
  ip ospf mtu-ignore
  ip ospf cost 2
  mpls ip
  mpls label protocol ldp
  mpls mtu 1536
!
interface Port-channel1.396
  description *** test subinterface for AToM ***
  encapsulation dot1Q 396
  xconnect 192.168.10.1 396 encapsulation mpls pw-class test-atom
!

   Verification:
ASR1000: show mpls l2transport vc detail
ASR9001: show l2vpn xconnect detail

2. VPLS. 

The interconnect interface settings and the basic MPLS configuration are the same as above.
In general, VPLS in this combination comes up as the official manuals describe, with one nuance: after configuring the first VPLS on an ASR1000 running software 3.10.2, the device has to be rebooted, otherwise phantom glitches of the "traffic was just flowing and suddenly stopped for a while" kind appear, with no VPLS signalling errors visible anywhere.


ASR9001 configuration
interface Bundle-Ether1.51 l2transport
  encapsulation dot1q 325

!
l2vpn
  logging
  pseudowire
 !
 pw-class core-vpls
  encapsulation mpls
  protocol ldp
  !
 bridge group test-group
  bridge-domain 51
   interface Bundle-Ether1.51
   !
   vfi Inet_corp1
    neighbor 192.168.10.2 pw-id 51
    pw-class core-vpls
ASR1002 configuration
l2vpn vfi context Inet_corp1
  vpn id 51
  member 192.168.10.1 encapsulation mpls
!
bridge-domain 51
 member Port-channel1 service-instance 51
 member vfi Inet_corp1
!
interface Port-channel2
  description *** test ***
  mtu 1546
  no ip address
  no negotiation auto
  service instance 51 ethernet
   encapsulation dot1q 325
 !

Verification:
ASR1000: show mpls l2transport vc detail
ASR9001: show l2vpn bridge-domain bd-name 51 detail
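A couple of extra checks that may help on the ASR1000 side (a sketch; exact command availability depends on the IOS XE version):
ASR1000: show l2vpn vfi name Inet_corp1 detail
ASR1000: show bridge-domain 51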