
freebsd-questions@freebsd.org • 12 August 2012 • 7:33AM -0400

ZFS stats in "top" -- ZFS performance started being crappy in spurts
by Chad Leigh - Pengar LLC

Hi

I have a FreeBSD 9 system with ZFS root.  It is actually a VM under Xen on a beefy piece of HW (4-core Sandy Bridge 3 GHz Xeon, total HW memory 32GB -- the VM has 4 vCPUs and 6GB RAM).  Mirrored gpart partitions.  I am looking for data integrity more than performance, as long as performance is reasonable (which it has more than been for the last 3 months).

The other VMs on the same HW are set up the same way but don't have this problem.  There are 4 other FreeBSD VMs.  One runs email for a one-man company and a few of his friends, as well as some static web pages for him, one runs a few low-use web apps for various customers, and one runs about 30 websites with apache and nginx, mostly just static sites.  None are heavily used.  There is also one Linux VM running a couple of low-use FrontBase databases.

The troublesome VM has been running fine for over 3 months since I installed it.  The level of use has been pretty much constant.  The server runs 4 jails, each dedicated to a different bit of email processing for a small number of users.  One is a secondary DNS.  One runs clamav and spamassassin.  One runs exim for incoming and outgoing mail.  One runs dovecot for imap and pop.  There is no web server or database or anything else running.

Total number of mail users on the system is approximately 50, plus or minus.  Total mail traffic is very low compared to "real" mail servers.

Earlier this week things started "freezing up".  Processes become unresponsive for anywhere from a few minutes to half an hour or more.  It eventually resolves itself and things are good for another 10 minutes or 3 hours until it happens again.  When it happens, lots of processes are listed in "top" as

zfs
zio->i
zfs
tx->tx
db->db

state.   These processes only get listed in these states when there are problems.   What are these states indicative of?
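
One way to record which wait channels processes are sleeping on during a freeze is to tally them from ps output instead of watching top.  The following is only a minimal sketch, not a diagnosis: count_wchan is a hypothetical helper name, and it assumes ps is invoked with exactly the columns pid,state,wchan,comm so that the wait channel is the third field.

```shell
#!/bin/sh
# count_wchan: hypothetical helper, not part of FreeBSD.
# Reads `ps -ax -o pid,state,wchan,comm` output on stdin and prints how
# many processes are sleeping on each wait channel, most common first.
count_wchan() {
    # Skip the header line and keep only the third (wchan) column,
    # then count identical values and sort by count, descending.
    awk 'NR > 1 { print $3 }' | sort | uniq -c | sort -rn
}

# Example (run while the system is stalled):
#   ps -ax -o pid,state,wchan,comm | count_wchan
```

Running this from a loop with a timestamp would give a record of how the counts change between good and bad periods.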

Eventually things get going again, these states drop off and the system hums along.

Based on some stuff I found in Google (for a person who had a different but somewhat similar problem) I tried setting

zfs set primarycache=metadata zroot

and

zfs set primarycache=none zroot

but the problem still happened with approximately the same severity and frequency.  (I wanted to see whether the system was "churning" on cache upkeep.)


What is strange is that this server ran fine for 3 months straight without interruption with the same level of work.

Thanks for any hints or clues
Chad



some data points below

---

# uname -a
FreeBSD newbagend 9.0-STABLE FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012     chad@underhill:/usr/obj/usr/src/sys/UNDERHILL-XEN  amd64
#

---

# zpool status
pool: zroot
state: ONLINE
scan: scrub repaired 0 in 6h13m with 0 errors on Fri Aug 10 19:33:23 2012
config:

NAME                                            STATE     READ WRITE CKSUM
zroot                                           ONLINE       0     0     0
  mirror-0                                      ONLINE       0     0     0
    gptid/f0da8263-8a52-11e1-b3ae-aa00003efccd  ONLINE       0     0     0
    gptid/0f24ab58-8a53-11e1-b3ae-aa00003efccd  ONLINE       0     0     0

errors: No known data errors
#

---

Representative zfs-stats output from a trouble period:

# zfs-stats -a


------------------------------------------------------------------------
ZFS Subsystem Report Sat Aug 11 13:40:07 2012
------------------------------------------------------------------------

System Information:

Kernel Version: 900505 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64

ZFS Storage pool Version: 28
ZFS Filesystem Version: 5

FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012 chad
1:40PM  up  2:54, 3 users, load averages: 0.23, 0.19, 0.14

------------------------------------------------------------------------

System Memory:

11.49% 681.92 MiB Active, 4.03% 238.97 MiB Inact
33.37% 1.93 GiB Wired, 0.05% 3.04 MiB Cache
51.04% 2.96 GiB Free, 0.01% 808.00 KiB Gap

Real Installed: 6.00 GiB
Real Available: 99.65% 5.98 GiB
Real Managed: 96.93% 5.80 GiB

Logical Total: 6.00 GiB
Logical Used: 46.76% 2.81 GiB
Logical Free: 53.24% 3.19 GiB

Kernel Memory: 1.25 GiB
Data: 98.38% 1.23 GiB
Text: 1.62% 20.75 MiB

Kernel Memory Map: 5.68 GiB
Size: 17.27% 1003.75 MiB
Free: 82.73% 4.70 GiB

------------------------------------------------------------------------

ARC Summary: (HEALTHY)
Memory Throttle Count: 0

ARC Misc:
Deleted: 9
Recycle Misses: 64.30k
Mutex Misses: 10
Evict Skips: 58.80k

ARC Size: 39.98% 1.20 GiB
Target Size: (Adaptive) 100.00% 3.00 GiB
Min Size (Hard Limit): 12.50% 384.00 MiB
Max Size (High Water): 8:1 3.00 GiB

ARC Size Breakdown:
Recently Used Cache Size: 25.56% 785.15 MiB
Frequently Used Cache Size: 74.44% 2.23 GiB

ARC Hash Breakdown:
Elements Max: 223.30k
Elements Current: 99.93% 223.15k
Collisions: 418.23k
Chain Max: 9
Chains: 66.67k

------------------------------------------------------------------------

ARC Efficiency: 3.17m
Cache Hit Ratio: 89.07% 2.82m
Cache Miss Ratio: 10.93% 346.27k
Actual Hit Ratio: 86.49% 2.74m

Data Demand Efficiency: 99.50% 1.09m
Data Prefetch Efficiency: 60.54% 1.78k

CACHE HITS BY CACHE LIST:
  Most Recently Used: 23.72% 669.34k
  Most Frequently Used: 73.38% 2.07m
  Most Recently Used Ghost: 1.92% 54.33k
  Most Frequently Used Ghost: 3.30% 93.02k

CACHE HITS BY DATA TYPE:
  Demand Data: 38.35% 1.08m
  Prefetch Data: 0.04% 1.08k
  Demand Metadata: 58.75% 1.66m
  Prefetch Metadata: 2.87% 80.97k

CACHE MISSES BY DATA TYPE:
  Demand Data: 1.56% 5.39k
  Prefetch Data: 0.20% 704
  Demand Metadata: 55.46% 192.02k
  Prefetch Metadata: 42.78% 148.15k

------------------------------------------------------------------------

L2ARC is disabled

------------------------------------------------------------------------

File-Level Prefetch: (HEALTHY)

DMU Efficiency: 6.05m
Hit Ratio: 66.59% 4.03m
Miss Ratio: 33.41% 2.02m

Colinear: 2.02m
  Hit Ratio: 0.04% 725
  Miss Ratio: 99.96% 2.02m

Stride: 3.90m
  Hit Ratio: 99.98% 3.90m
  Miss Ratio: 0.02% 826

DMU Misc:
Reclaim: 2.02m
  Successes: 2.02% 40.86k
  Failures: 97.98% 1.98m

Streams: 125.81k
  +Resets: 0.36% 453
  -Resets: 99.64% 125.36k
  Bogus: 0

------------------------------------------------------------------------

VDEV Cache Summary: 530.68k
Hit Ratio: 15.30% 81.21k
Miss Ratio: 70.40% 373.57k
Delegations: 14.30% 75.89k

------------------------------------------------------------------------

ZFS Tunables (sysctl):
kern.maxusers                           512
vm.kmem_size                            6222712832
vm.kmem_size_scale                      1
vm.kmem_size_min                        0
vm.kmem_size_max                        329853485875
vfs.zfs.l2c_only_size                   0
vfs.zfs.mfu_ghost_data_lsize            91367424
vfs.zfs.mfu_ghost_metadata_lsize        128350208
vfs.zfs.mfu_ghost_size                  219717632
vfs.zfs.mfu_data_lsize                  132299264
vfs.zfs.mfu_metadata_lsize              20034048
vfs.zfs.mfu_size                        160949760
vfs.zfs.mru_ghost_data_lsize            45155328
vfs.zfs.mru_ghost_metadata_lsize        642998784
vfs.zfs.mru_ghost_size                  688154112
vfs.zfs.mru_data_lsize                  347115520
vfs.zfs.mru_metadata_lsize              10907136
vfs.zfs.mru_size                        794174976
vfs.zfs.anon_data_lsize                 0
vfs.zfs.anon_metadata_lsize             0
vfs.zfs.anon_size                       29469696
vfs.zfs.l2arc_norw                      1
vfs.zfs.l2arc_feed_again                1
vfs.zfs.l2arc_noprefetch                1
vfs.zfs.l2arc_feed_min_ms               200
vfs.zfs.l2arc_feed_secs                 1
vfs.zfs.l2arc_headroom                  2
vfs.zfs.l2arc_write_boost               8388608
vfs.zfs.l2arc_write_max                 8388608
vfs.zfs.arc_meta_limit                  805306368
vfs.zfs.arc_meta_used                   805310296
vfs.zfs.arc_min                         402653184
vfs.zfs.arc_max                         3221225472
vfs.zfs.dedup.prefetch                  1
vfs.zfs.mdcomp_disable                  0
vfs.zfs.write_limit_override            0
vfs.zfs.write_limit_inflated            19260174336
vfs.zfs.write_limit_max                 802507264
vfs.zfs.write_limit_min                 33554432
vfs.zfs.write_limit_shift               3
vfs.zfs.no_write_throttle               0
vfs.zfs.zfetch.array_rd_sz              1048576
vfs.zfs.zfetch.block_cap                256
vfs.zfs.zfetch.min_sec_reap             2
vfs.zfs.zfetch.max_streams              8
vfs.zfs.prefetch_disable                0
vfs.zfs.mg_alloc_failures               8
vfs.zfs.check_hostid                    1
vfs.zfs.recover                         0
vfs.zfs.txg.synctime_ms                 1000
vfs.zfs.txg.timeout                     5
vfs.zfs.scrub_limit                     10
vfs.zfs.vdev.cache.bshift               16
vfs.zfs.vdev.cache.size                 10485760
vfs.zfs.vdev.cache.max                  16384
vfs.zfs.vdev.write_gap_limit            4096
vfs.zfs.vdev.read_gap_limit             32768
vfs.zfs.vdev.aggregation_limit          131072
vfs.zfs.vdev.ramp_rate                  2
vfs.zfs.vdev.time_shift                 6
vfs.zfs.vdev.min_pending                4
vfs.zfs.vdev.max_pending                10
vfs.zfs.vdev.bio_flush_disable          0
vfs.zfs.cache_flush_disable             0
vfs.zfs.zil_replay_disable              0
vfs.zfs.zio.use_uma                     0
vfs.zfs.snapshot_list_prefetch          0
vfs.zfs.version.zpl                     5
vfs.zfs.version.spa                     28
vfs.zfs.version.acl                     1
vfs.zfs.debug                           0
vfs.zfs.super_owner                     0
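
One detail visible in the tunables above is that vfs.zfs.arc_meta_used (805310296) is marginally above vfs.zfs.arc_meta_limit (805306368).  Whether that has anything to do with the stalls is not established here.  As a minimal sketch for tracking it over time, the hypothetical helper arc_meta_report below reads "name value" lines in the format shown above (or "name: value" as printed by sysctl) and reports the ratio.

```shell
#!/bin/sh
# arc_meta_report: hypothetical helper, not part of FreeBSD or zfs-stats.
# Reads "name value" or "name: value" lines on stdin and reports ARC
# metadata usage as a percentage of its configured limit.
arc_meta_report() {
    awk '
        { name = $1; sub(/:$/, "", name) }      # tolerate a trailing colon
        name == "vfs.zfs.arc_meta_used"  { used = $2 }
        name == "vfs.zfs.arc_meta_limit" { limit = $2 }
        END {
            if (limit <= 0) { print "arc_meta_limit not found"; exit 1 }
            printf "arc_meta_used=%d arc_meta_limit=%d (%.1f%% of limit)\n",
                   used, limit, 100 * used / limit
        }
    '
}

# Example:
#   sysctl vfs.zfs.arc_meta_used vfs.zfs.arc_meta_limit | arc_meta_report
```

Comparing the output from a calm period with the output from a stalled one would show whether metadata pressure differs between the two.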

------------------------


Representative zpool iostat output from a trouble period.  As you can see, not much is going on: load is low, and the iostat output during a calm, good period is about the same.

# zpool iostat zroot 1


              capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----

zroot        107G  41.9G      7    261  23.8K  1.52M
zroot        107G  41.9G     10    140  7.42K   272K
zroot        107G  41.9G      8    176  14.4K   547K
zroot        107G  41.9G      0     59      0   188K
zroot        107G  41.9G      5    171  6.44K  1.73M
zroot        107G  41.9G      4    284  8.42K  1006K
zroot        107G  41.9G      5    118  2.97K   260K
zroot        107G  41.9G     25    194  27.7K   623K
zroot        107G  41.9G      0    132      0   764K
zroot        107G  41.9G      1     95  6.44K  1.16M
zroot        107G  41.9G      8    272  16.3K   829K
zroot        107G  41.9G     56    212   103K   213K
zroot        107G  41.9G     22    221  27.7K   204K
zroot        107G  41.9G      2    455  1.48K   509K
zroot        107G  41.9G     14    198  7.42K   132K
zroot        107G  41.9G     14    270  7.42K   306K
zroot        107G  41.9G      6    273  3.46K   670K
zroot        107G  41.9G     21    175  10.9K   570K
zroot        107G  41.9G     17    179  8.91K   591K
zroot        107G  41.9G     11    289  17.3K   902K
zroot        107G  41.9G     13    121  6.93K   230K
zroot        107G  41.9G     18    238  9.41K   734K
zroot        107G  41.9G     99     61  50.5K   188K
zroot        107G  41.9G      0    222      0   862K
zroot        107G  41.9G     11    149  13.4K  1.12M
zroot        107G  41.9G     15    319  10.9K  1.05M
zroot        107G  41.9G      0    127      0   392K
zroot        107G  41.9G      0    159      0  1.70M
zroot        107G  41.9G     68    196   212K   601K
zroot        107G  41.9G     17    144  18.8K   295K
zroot        107G  41.9G     12    187  17.3K   588K
zroot        107G  41.9G      0    136      0  1.23M
zroot        107G  41.9G      6    209  23.8K   564K
zroot        107G  41.9G     11    199  12.4K   422K
zroot        107G  41.9G     12    178  9.41K   553K
zroot        107G  41.9G      0    140  1.48K  1.17M
zroot        107G  41.9G     48    200   128K   411K
zroot        107G  41.9G      8    191  16.8K   121K
zroot        107G  41.9G      1    397   1013   375K
zroot        107G  41.9G      0    263      0   132K
zroot        107G  41.9G     14    228  13.4K   235K
zroot        107G  41.9G      7     21  4.46K  10.9K
zroot        107G  41.9G      2    161  1.48K   156K
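
To compare a trouble period with a calm one by numbers rather than by eye, the operations columns can be averaged.  This is a rough sketch under stated assumptions: iostat_ops_avg is a hypothetical helper name, the input has the column layout shown above, and only rows whose operations columns are plain integers are counted (rows where zpool iostat abbreviates a value with a suffix are skipped).

```shell
#!/bin/sh
# iostat_ops_avg: hypothetical helper, not part of ZFS.
# Reads `zpool iostat POOL INTERVAL` output on stdin and prints the mean
# read and write operations per sample for the named pool.
iostat_ops_avg() {
    awk -v pool="${1:-zroot}" '
        # Columns: pool alloc free read-ops write-ops read-bw write-bw
        $1 == pool && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ {
            r += $4; w += $5; n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d avg_read_ops=%.1f avg_write_ops=%.1f\n",
                   n, r / n, w / n
        }
    '
}

# Example (60 one-second samples):
#   zpool iostat zroot 1 60 | iostat_ops_avg zroot
```

Note that the first line zpool iostat prints is an average since boot rather than a one-second sample, so it may be worth dropping before averaging.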

