
Doctor Team Meetings

Meeting Info:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/ and below


December 13th

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-12-13-14.00.html

  • PoC with limited checkpoints => should be OK for Danube. Checkpoints on component boundaries only; calling for a contributor who has expertise in bash programming and is familiar with the existing modules
  • Integration with verification job => should be OK for Danube with PoC
  • Display the profiling result in console => should be OK for Danube with PoC
  • Report the profiling result to test database => will be in `doctor.profiler` module, not sure for Danube
  • Independent package which can be installed to specified environment => will be in `doctor.profiler`, not sure for Danube
  • Separate CI job?
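
The checkpoint idea in the list above can be illustrated with a minimal sketch. This is not the planned `doctor.profiler` module; the class name, its methods, and the checkpoint names used in the usage note are all hypothetical, and the sketch only assumes that a timestamp is recorded at each component boundary and the elapsed time between consecutive checkpoints is printed to the console.

```python
import time


class CheckpointProfiler:
    """Hypothetical helper: records named checkpoints and reports the
    time elapsed between consecutive ones."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the profiler can be tested
        # without real delays.
        self._clock = clock
        self._checkpoints = []  # list of (name, timestamp) tuples

    def checkpoint(self, name):
        """Record that execution crossed the boundary called `name`."""
        self._checkpoints.append((name, self._clock()))

    def intervals(self):
        """Return [(from_name, to_name, seconds), ...] for consecutive checkpoints."""
        pairs = zip(self._checkpoints, self._checkpoints[1:])
        return [(a, b, tb - ta) for (a, ta), (b, tb) in pairs]

    def report(self):
        """Format the intervals as console-friendly text, one line each."""
        lines = []
        for start, end, seconds in self.intervals():
            lines.append(f"{start} -> {end}: {seconds * 1000:.1f} ms")
        return "\n".join(lines)
```

A caller would invoke `checkpoint("link down")`, `checkpoint("monitor detected")` and so on at each boundary (names invented here for illustration) and print `report()` at the end of the run.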

December 6th

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-12-06-14.02.html

  • CI job status
  • Reviewers: check doctor.log carefully, as the jobs are reporting false positives
  • it's running with mitaka openstack (with newton congress), all CI jobs are running on the same deployment without any recreation
  • CI concerns: which version should we use?
  • apex master + functest master? --> not stable, builds per commit?
  • apex/functest checkpoints? --> who will take care?

October 4, 2016

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-10-04-13.04.html

  • SFQM D planning report (by Maryam)
    • Collectd plugins:
      • OVS stats + Events: interface stats + Link status changes and if the OVS daemon dies on the server.
      • RAS events - Notification of Machine Check Errors
      • RDT stats - Last Level Cache Occupancy, Instructions per clock and Memory BW utilization stats
      • BIOS - report vendor, version, and revision.
      • Legacy - report thermals, voltages and fan speed metrics.
      • SNMP collector - map any of the existing collectd metrics to MIBs and reply to SNMP walks/get/get next (Traps will not be supported)
    • Fuel enhancements to collectd plugin for SFQM today.
    • Functest for the plugins listed above.

June 21, 2016 - OPNFV Summit Berlin

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-06-21-08.21.html

  • 10:15-10:45 @ Tegel Room
  • Maintenance / round table.
  • Current state of proposed spec
  • 11:00-11:30 @ Tegel Room
  • Integration & Testing Improvements
  • In this session, we'll check out details of tasks for Colorado, as well as further improvement plans
  • 11:30-12:00 @ Tegel Room
  • Expectations and Next steps for Doctor for the next 6 months (D-release goals)
  • 15:00- 
  • Demo story readout & dry run
  • Doctor+Congress
  • Doctor+Vitrage
  • AP
  • AP1: discuss with Fatih, Jack Morgan about the scenario to join and requirements for CI

June 14, 2016

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-06-14-13.03.html

  • presentation plans at OPNFV summit
  • Failure Inspection in Doctor utilizing Vitrage and Congress (Ryota, Ohad, Masa)
  • Doctor tech deep dive (Tomi, Ryota)
  • Advancing Upstream Collaboration with OpenStack (Tomi)
  • Monday 14:40-15:10
  • Doctor and SFQM Frameworks Unite: Intelligently Monitoring the NFV Infrastructure
  • Service Assurance @ DPDK/FD.io Mini-Summit (Tuesday)
  • check of demo status
  • Doctor+Congress
  • Both NIC down rule
  • execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname, vif_type=viftype, type="host.nic1.down"),
        doctor:events(hostname=hostname, vif_type=viftype, type="host.nic2.down")
  • One NIC down rule
  • execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname, vif_type=viftype, type="host.nic1.down")
  • execute[neutronv2:force_down_port(portid)] :-
        neutronv2:ports(id=portid, hostid=hostname, vif_type=viftype),
        doctor:events(hostname=hostname, vif_type=viftype, type="host.nic2.down")
  • Doctor+Vitrage
  • PoC Filming Wednesday 09:00-09:30
  • Design session topics
  • C release tasks
  • AoB
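
The difference between the "Both NIC down" and "One NIC down" Congress rules listed above can be paraphrased in plain Python. This is only a sketch of the join logic, not Congress code: the function name and the dictionary layout for ports and events are assumptions made for illustration, while the field names and event types are taken from the rules themselves.

```python
def ports_to_force_down(ports, events, require_both_nics):
    """Return the IDs of ports that the rules would force down.

    ports:  iterable of dicts with keys "id", "hostid", "vif_type"
    events: iterable of dicts with keys "hostname", "vif_type", "type"
    require_both_nics: True mimics the "Both NIC down" rule (one rule,
        both events must be present); False mimics the "One NIC down"
        pair of rules (either event is enough).
    """
    # Index the events so each port lookup is a set membership test.
    down = {(e["hostname"], e["vif_type"], e["type"]) for e in events}

    result = []
    for port in ports:
        # The rules join the port's hostid with the event's hostname.
        host, vif = port["hostid"], port["vif_type"]
        nic1 = (host, vif, "host.nic1.down") in down
        nic2 = (host, vif, "host.nic2.down") in down
        matched = (nic1 and nic2) if require_both_nics else (nic1 or nic2)
        if matched:
            result.append(port["id"])
    return result
```

With one host reporting both NICs down and another reporting only `host.nic1.down`, the first variant selects only the ports on the first host, while the second variant selects the ports on both hosts.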

June 7, 2016

Agenda:

  • PoC status
  • "Congress-demo": status; story
  • "Vitrage demo": status, story
  • Design session topics
  • C release tasks
  • AoB
  • Request to share overview of presentations in OPNFV summit

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-06-07-13.00.html  

May 31, 2016

May 24, 2016

Agenda:

  • C release tasks
  • AoB
  • Presentation: Advancing Upstream Collaboration with OpenStack @ OPNFV Summit 11:20am, 22 June

May 17, 2016

Agenda:

  • OPNFV summit
  • Booth
  • Move-in
  • Tuesday, June 21 | 12:00 – 15:00
  • • PoC zone Hours
  • • Wednesday, June 22 |8:00- 9:00 & 10:30 – 17:00
  • • Thursday, June 23 | 8:00- 9:00 & 10:30 – 15:30
  • • Move-Out
  • • Thursday, June 23 | 15:30 – 17:30
  • Design Summit Project Breakouts and Hacking
  • C release task bashing
  • AOB
  • Some words about the maintenance if time

May 10, 2016

Agenda:

  • C release activities and upcoming milestone
  • Carlos (I'll not be joining today's meeting): created a couple of JIRA issues related to Aodh and Congress support in multiple installers and linked/marked them as dependencies.
  • Plan is to reach out to installer PTLs next week if no ack or progress
  • OPNFV Summit SP PoC
  • Carlos: no updates from OPNFV events team
  • Carlos: to all involved please prepare some slides in 1-2 weeks with e.g. number and dimensions of servers and switches, use cases being showcased, demo displaying mock up, etc.
  • Maintenance in Nova
  • Get servers filtered by host status BP will probably be abandoned.
  • No efficient way of doing filtering in Nova, so it can be left outside.
  • AOB
  • OPNFV Summit Sessions

May 3, 2016

Agenda:

  • Report from OpenStack Summit
  • C release activities and upcoming milestone
  • OPNFV summit - SP PoCs

April 27, 2016

April 19, 2016

Agenda:

  • OpenStack Summit
  • Ops: Nova Maintenance - how do you do it?
  • Presentation readiness
  • Congress + Doctor (see slides)
  • Intel (SFQM) + Doctor
  • Session timeline (and high-level outline):
  • 15 min: [Iris] opening by CloudBand: introduction of topic and speakers, NFV industry transformation to open source, Nokia involvement in open source until project Vitrage
  • 10 min: [Gerald] handover to Doctor: overview of Doctor project, focusing on Inspector including requirements, etc.
  • 10 min: [Ohad] Vitrage: short introduction of Vitrage project: scope, basic architecture, use cases. Then, focus on how Vitrage does fault management for NFV and how it implements the Inspector.
  • 5 min: if leftover time – time for questions.
  • C release actions
  • Contributors/Committers List Update
  • JIRA tickets
  • OPNFV Summit
  • PoC Booth(s)

April 12, 2016

Agenda:

  • Colorado release planning
  • OPNFV PoCs
  • Maintenance
  • No changes to Nova?
  • AOB
  • Doctor presentations for OpenStack Summit - proposal to present draft slideset for OpenStack in next week's Doctor meeting

April 5, 2016

Agenda:

  • Colorado release planning
  • Maintenance use cases
  • OpenStack Summit Demos
  • Tests with Models project
  • AOB
  • can presenters update the Etherpad page to only show the talks that had been submitted

March 29, 2016

Agenda:

  • SFQM - collectd update (Maryam)
  • Discussion on OPNFV Summit proposals (Carlos)
  • Maintenance spec (Tomi)
  • AOB
  • New Wiki
  • TODOs: JIRA works

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-03-29-13.00.html

March 22, 2016

Agenda:

  • IMPORTANT NOTE: As we have weekly Doctor meetings fixed at 06:00 Pacific Time, today's meeting will start at 14:00 (CEST) instead of 15:00 because Pacific went to Daylight Saving Time already last week. Daylight Saving Time begins on March 27 in Europe. The change to 14:00 only applies to today's meeting!
  • Hackfest Quick Reports (Carlos)
  • Action Ryota to open and cleanup JIRA tickets according to the result of the discussion at the hackfest
  • Maintenance changes to Nova (Tomi)
  • Propose a spec as normal (Tomi: can present the main idea in this meeting).
  • Had some chat with John Garbutt (Nova PTL); he is also very interested in the topic and is looking forward to seeing the spec, and it would be a natural continuation of the get-valid-server-state work. He also proposed that I could discuss this with the API team.
  • Initial proposal would be to have start and end time visible, with an API to set them and visibility to the VM owner as well. Currently there is only disable/enable to stop scheduling, but this would be for the real maintenance period.
  • Any actions for VMs would be harder and out of scope. Also talked a bit about this with Sean Dague (core). The prevailing opinion has been to keep this external, but gaps surely need to be implemented in Nova if the use case needs it. There is some discussion about auto recovery, but that is perhaps addressed later.
  • AoB
  • OpenStack Summit
  • Reminder: multiple Doctor members got presentations accepted -- thank you all! This is no news but good to share once again :)
  • OPNFV Summit
  • Reminder: CFP deadline March 31

March 8, 2016

Agenda:

  • Vitrage BPs to be added
  • Maintenance
  • Now there is notification and Nova API support, maybe should define the rest (hackfest)
  • OPNFV meetings:
  • Plugfest?
  • Summit? / design sessions / presentations
  • AOB
  • next meeting? skip? - restart from Mar 22
  • Intent to participate in C release
  • OpenStack Summit

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-03-08-13.59.html

March 1, 2016

Agenda:

  • Congrats! Doctor is in the first cut of Brahmaputra!
    • we need fix on functest dashboard
  • Hackfest @ ONS: Doctor Session Planning
    • OpenStack Austin Summit prep - How to approach OpenStack with blueprints? (Ildiko)
      • Nova, Neutron, Cinder
    • Vitrage discussion. What is released in Austin, what would be expected next.
      • Doctor SB API
    • Integration & Testing
    • Maintenance?
    • Hacking session (integrating SFQM+Doctor) 2-3 hours?
  • AOB
    • Open to proposals: use JIRA, gerrit, etherpad, ML as you like

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-03-01-14.01.html

Feb 23, 2016

Agenda:

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-02-23-14.00.html

Feb 16, 2016

Agenda:

  • OPNFV B release readiness
    • We still need our test scenario to run successfully, but we are waiting for the apex jobs in functest to run successfully first.
    • Release note
  • Vitrage discussion
  • C release features
  • AOB
    • OpenStack Summit CFS Vote

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-02-16-13.59.html

Feb 9, 2016

Agenda:

Minutes: http://ircbot.wl.linuxfoundation.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-02-09-14.00.html

Feb 2, 2016

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-02-02-14.00.html

Jan 26, 2016

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-01-26-14.01.html

Jan 19, 2016

Agenda:

  • OPNFV B release Readiness
    • Deployment tools code freeze
    • Testing tools and test cases code freeze
      • Status: Done in Doctor, functest integration done (Jan 15), but no test
    • Scenarios operational (deploy/build/test defined and can be triggered from jenkins automatically)
      • Status: No, but can be started quickly once deployment tool is ready (test scenario can be triggered by doctor script change)
    • All documents ready for review. No new content, only corrections after this date.
      • Status: platform overview is ready, but config guide and user manual are still missing
  • OpenStack BPs
  • AOB

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-01-19-14.00.html

Jan 12, 2016

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-01-12-14.00.html

Jan 5, 2016

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2016/opnfv-doctor.2016-01-05-14.01.html

Dec 22, 2015

Agenda:

  • Milestone E focus
    • test case/script first by this Friday
    • then enable jenkins jobs and prepare manuals
  • JIRA Issues
  • AOB

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-12-22-14.00.html

Dec 15, 2015

Agenda:

  • JIRA Issues
  • AOB
    • Next Meeting?
      • Dec 22 meeting
      • Dec 29 skip
      • Jan 5 meeting

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-12-15-14.02.html

Dec 8, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-12-08-14.00.html

Dec 1, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-12-01-14.00.html

Nov 24, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-11-24-13.59.html

Nov 17, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-11-17-13.59.html

Nov 9-10, 2015

OPNFV design summit sessions

Nov 9th, 15:15-14:00 at Room #3:

Nov 10th, 11:00-12:00 at Room #2:

Nov 10th, 14:00-15:00 at Room #2:

Nov 3, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-11-03-13.59.html

Oct 28, 2015

Meeting Info:

  • Oct 28 (Wednesday) F2F meeting in OpenStack Summit Tokyo
  • When: 9-11am
  • Where: Community Lounge, 1F, #3 International Convention Center Pamir

Minutes:

  • Nova BPs
  • entities are "service" (representing nova-compute), "server" (VM, Instance) and physical machine
  • No notification on service-mark-down and vm-reset-state (reset state already exists in Nova)
  • Related BPs:
    • get valid VM state - may have discussion on friday
    • notification when force-down or disabled - no session... - should be in Mitaka
    • notification when server state has been reset - we have to draft and find assignee
    • add new params in reset server state API
    • versioned notification - 2:40pm - 3:20pm on Thursday <--- blocker for balaz's BP
    • extend mark-service-down to allow marking affected VMs down as an option of that API
  • 3 levels of computing failure:
    • compute node is down
    • nova-compute service is down
    • VM running on compute node is down
  • if nova-compute service goes down, VNF will likely continue running without any service disruption. Thus, VIM may not need to notify VNFM as the VNF instance is not being affected
    • Although the VNFM may not need to be notified, we need to notify the NFVO.
    • In this specific case, there's a clear separation of Consumer: VNFM and NFVO(/VIM administrator)
  • auto-reaction
    • VIM should not execute auto-reaction based on policy in server metadata specified by the user, or should it?
    • use case should be discussed and described in doctor document
  • Neutron BP
    • Assignee: Carlos (NEC)
    • we need to clarify what we do in Nova, and discuss use cases and requirements in Doctor first
  • Cinder BP
    • Assignee: ZTE team
  • Alarming / Aodh etc.
    • Monasca, Congress, Vitrage
    • Doctor + Congress meeting, Wed 28 Oct 15:40-16:20, room?
  • OPNFV Summit
    • Presentation
    • target audience are devs
    • 10 mins for use cases, 10 mins for upstream activities including plans / project status
  • PoC Demo at docomo booth
    • Continuing preparing handouts
  • Design summit topics (design summit running Mon-Tue, 9-10 Nov)
    • agenda should be fixed in the next week (by Nov 6)
      • Congress
      • Nova BP
      • test cases/tool
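
The three levels of computing failure and the VNFM/NFVO distinction discussed in the minutes above can be written down as a small lookup. This is a sketch under stated assumptions: the minutes only settle the nova-compute case (the VNFM may not need to be notified, the NFVO does). The mapping for the other two failure levels is an assumption added here for illustration, as are the enum and function names.

```python
from enum import Enum


class Failure(Enum):
    COMPUTE_NODE_DOWN = "compute node is down"
    NOVA_COMPUTE_SERVICE_DOWN = "nova-compute service is down"
    VM_DOWN = "VM running on compute node is down"


def consumers_to_notify(failure):
    """Return the set of consumers the VIM would notify for a failure."""
    if failure is Failure.NOVA_COMPUTE_SERVICE_DOWN:
        # From the minutes: the VNF likely keeps running, so the VNFM
        # may not need a notification, but the NFVO does.
        return {"NFVO"}
    # ASSUMPTION (not stated in the minutes): when the node or the VM
    # itself is down the VNF is affected, so both consumers are notified.
    return {"VNFM", "NFVO"}
```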

Oct 20, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-10-20-12.59.html

Oct 13, 2015

Agenda:

  • fastpath metrics update
    • Extended Error statistics API in DPDK - patches submitted:
      • Please see http://dpdk.org/dev/patchwork/project/dpdk/list/?submitter=317
        • [dpdk-dev,2/2] igb: fix VF statistic wraparound handling
        • [dpdk-dev,1/2] ixgbe: fix VF statistic wraparound handling macro
        • [dpdk-dev] ixgbe: fix 82599 / 82598 register differences
        • [dpdk-dev,v2,11/11] fm10k: add xstats() implementation
        • [dpdk-dev,v2,10/11] i40evf: add xstats() implementation
        • [dpdk-dev,v2,09/11] i40e: add xstats() implementation
        • [dpdk-dev,v2,08/11] ixgbevf: add xstats() functions to VF
        • [dpdk-dev,v2,07/11] ixgbe: update statistic strings to scheme
        • [dpdk-dev,v2,06/11] igbvf: add xstats() implementation
        • [dpdk-dev,v2,05/11] igb: add xstats() implementation
        • [dpdk-dev,v2,04/11] virtio: add xstats() implementation
        • [dpdk-dev,v2,03/11] ethdev: update xstats_get() strings and Q handling
        • [dpdk-dev,v2,02/11] doc: add extended statistics to prog_guide
        • [dpdk-dev,v2,01/11] doc: add extended statistics notes
    • DPDK Keep Alive - patches submitted:
    • Patches are currently awaiting review of the DPDK community
    • collectd-dpdk
      • Started implementing a dpdk plugin for collectd that will retrieve the DPDK extended stats.
      • Testing of initial implementation underway.
    • collectd-ceilometer
      • Have an implementation that pushes stats directly to ceilometer from collectd.
      • working on the deployment code for devstack.
  • Maintenance
  • JIRA Issue check

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-10-13-12.59.html

Oct 6, 2015

Agenda:

  • Typical use case for OPNFV B release
  • Zabbix limitation
  • Alarm gaps
    • Gap analysis: NFV Alarms in IFA006 vs Ceilometer alarms
  • Maintenance

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-10-06-13.00.html

Sep 29, 2015

Agenda:

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-09-29-13.01.html

Sep 22, 2015

Agenda:

  • B release planning / status
  • Typical use case for OPNFV B release

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-09-22-13.01.html

Sep 15, 2015

Agenda:

  • Vaccine project
  • B release planning / status

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-09-15-12.57.html

Sep 8, 2015

Agenda:

  • DOCTOR-21: BGS(genesis)
  • New members

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-09-08-13.00.html

Sep 1, 2015

Agenda:

  • Brief update about Nova BPs (DOCTOR-29)
  • SFQM use cases (DOCTOR-27)
  • Neutron mark down API (DOCTOR-18)
  • Requirement Doc fix: fignum (DOCTOR-13)
  • Ceilometer event-alarm (DOCTOR-24)
  • Requirement Doc fix: poison (DOCTOR-8)
  • DOCTOR-15 move fault table to Annex
  • assign JIRA items

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-09-01-13.01.html

Aug 25, 2015

Agenda:

  • Ceilometer BP Status
  • Nova BP fixing evacuation
  • Pinpoint Intro

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-08-25-12.55.html

Aug 18, 2015

Agenda:

  • Ceilometer BP Status
  • JIRA items

Minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-08-18-13.02.html

Aug 11, 2015

Agenda:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-08-11-12.56.html

Aug 4, 2015

Agenda

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-08-04-12.56.html

July 30, 2015

Meeting at OPNFV Hackfest: https://etherpad.opnfv.org/p/doctor_hackfest_20150730

July 28, 2015

Agenda:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-07-28-13.01.html

July 21, 2015

Agenda:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-07-21-12.58.html

July 14, 2015

Agenda:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-07-14-12.56.html

July 7, 2015

Agenda:

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-07-07-13.00.html

June 30, 2015

Agenda:

  • Nova BP Status
  • Ceilometer BP Status
  • Demo status
  • Doctor session at OPNFV hackfest
  • AoB
    • Deliverables to OPNFV R2

Meeting minutes: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-06-30-12.58.html

June 23, 2015

Agenda:

  • BP Status
    • Nova
    • Ceilometer
  • Deliverable status
  • AoB
    • Ceilometer virtual mid-cycle meetup
    • OPNFV Hackfest

Minutes:

  • BP Status Nova
  • BP Status Ceilometer
    • https://review.openstack.org/#/c/172893/
    • spec got approved by 3 core reviewers
    • more developers from Intel working on it
      • in total 3 developers working on this spec
    • Alarm related activity: spin-off project from Ceilometer alarming
  • Deliverable status
    • tagged '2015.1.0'
    • working on open issues (review comments). pls join discussion in Gerrit.
    • proposal to discuss open comments in Doctor meeting
  • AoB
    • Ceilometer virtual mid-cycle
    • OPNFV Hackfest
      • at ODL summit (not OPNFV slot in ODL summit) https://etherpad.opnfv.org/p/OPNFV_at_ODL
      • #action Carlos to prepare demo script for demo
      • Ildiko and Gerald propose to have a session on Doctor, where demo is part of, but also show status of BPs/specs and way forward
    • req-doc discussion
      • fencing
        • Fencing gap had been part of earlier version of Doctor document, but was removed
        • agree mention fencing in "general features", but do not have gap on it
      • Maintenance state
        • DOCTOR-11 has wider scope. offline review needed.
        • agreement to have two maintenance states ("going-to-maintenance" and "in-maintenance")
      • User can stop maintenance
        • ...or user does not respond to maintenance request
        • error cases needed, e.g. resend maintenance request after timeout
        • how to handle cases where a user is sending NACK?
        • (need further discussion)
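
The maintenance discussion above (two states, a user who may stop maintenance or not respond, resending after a timeout) can be sketched as a small state machine. This is an illustration only, not an agreed design: the class and method names and the resend limit are hypothetical, treating a NACK as "maintenance stopped" is an assumption because the minutes leave NACK handling open, and the "escalate" outcome follows the "give responsibility back to Administrator" remark in the IRC log.

```python
from enum import Enum


class HostState(Enum):
    NORMAL = "normal"
    GOING_TO_MAINTENANCE = "going-to-maintenance"
    IN_MAINTENANCE = "in-maintenance"


class MaintenanceRequest:
    """Hypothetical tracker for one maintenance request sent to a user."""

    def __init__(self, max_resends=2):
        self.state = HostState.NORMAL
        self.max_resends = max_resends  # assumed policy parameter
        self.resends = 0

    def start(self):
        """Administrator announces maintenance; the request goes to the user."""
        self.state = HostState.GOING_TO_MAINTENANCE
        self.resends = 0

    def on_ack(self):
        """User agreed, so the host may enter the real maintenance period."""
        if self.state is HostState.GOING_TO_MAINTENANCE:
            self.state = HostState.IN_MAINTENANCE

    def on_nack(self):
        """ASSUMPTION: a NACK stops the maintenance (open point in the minutes)."""
        if self.state is HostState.GOING_TO_MAINTENANCE:
            self.state = HostState.NORMAL

    def on_timeout(self):
        """User did not respond: resend, then hand back to the administrator."""
        if self.state is not HostState.GOING_TO_MAINTENANCE:
            return "ignore"
        if self.resends < self.max_resends:
            self.resends += 1
            return "resend"
        return "escalate"
```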

IRC Log:

    22:00:08 r-mibu     > #startmeeting doctor
    22:00:40 GeraldK    > #info Gerald
    22:00:46 tojuvone   > #info Tomi Juvonen
    22:02:03 r-mibu     > #startmeeting doctor
    22:02:10 r-mibu     > #endmeeting
    22:04:05 r-mibu     > #link https://etherpad.opnfv.org/p/doctor_meetings
    22:04:31 cgoncalves > #info Carlos Goncalves
    22:04:37 ildikov    > #info Ildiko Vancsa
    22:04:55 bertys     > #info Bertrand Souville
    22:07:13 r-mibu     > #topic BP Status Nova
    22:08:42 GeraldK    > #info Roman (intel) is back and will continue implementation
    22:09:35 GeraldK    > #info Carlos is facing some issues with proposed patch (conflict in gerrit)
    22:11:06 bertys     > #link https://review.openstack.org/#/c/185849/
    22:13:52 r-mibu     > #topic BP Status Ceilometer
    22:14:05 r-mibu     > #link https://review.openstack.org/#/c/172893/
    22:14:23 GeraldK    > #info spec got approved by 3 core reviewers
    22:14:43 GeraldK    > #info 2 more developers from Intel working on it
    22:15:00 GeraldK    > #info in total 3 developers working on this spec
    22:16:14 ildikov    > #link https://review.openstack.org/192684
    22:17:04 ildikov    > #link https://review.openstack.org/192688
    22:18:57 r-mibu     > #topic Deliverable status
    22:19:26 GeraldK    > #link http://artifacts.opnfv.org/
    22:19:27 r-mibu     > #info tagged '2015.1.0'
    22:22:00 GeraldK    > #info working on open issues (review comments). pls join discussion in Gerrit.
    22:26:37 GeraldK    > #info proposal to discuss open comments in Doctor meeting
    22:27:23 r-mibu     > #topic AoB
    22:27:24 GeraldK    > #topic AOB
    22:28:05 r-mibu     > #info Ceilometer *virtual* mid-cycle 
    22:28:26 GeraldK    > #info Ceilometer F2F midcylce event is canceled, there will be virtual discussion, e.g. IRC
    22:28:35 r-mibu     > #link https://etherpad.openstack.org/p/ceilometer-liberty-midcycle
    22:29:16 ildikov    > #link http://doodle.com/6vfksdu38wcwqqd3
    22:29:17 GeraldK    > #info dates are not yet fixed. doodle vote.
    22:29:34 ildikov    > #info Ceilometer has a virtual mid-cycle as opposed to face-to-face
    22:32:49 r-mibu     > #info OPNFV Hackfest
    22:33:39 GeraldK    > #action Carlos to prepare demo script for demo
    22:37:19 GeraldK    > #info Ildiko and Gerald propose to have a session on Doctor, where demo is part of, but also show status of BPs/specs and way forward
    22:38:01 r-mibu     > #topic req-doc discussion
    22:38:13 r-mibu     > #info fencing
    22:39:01 r-mibu     > #link http://artifacts.opnfv.org/doctor/html/03-architecture.html
    22:41:14 cgoncalves > #link https://gerrit.opnfv.org/gerrit/#/c/882/
    22:41:47 GeraldK    > #info Fencing gap had been part of earlier version of Doctor document, but was removed
    22:41:59 r-mibu     > #info fencing is one of external system responsibilities (when the host mark down) 
    22:42:19 GeraldK    > #info gerrit patch is proposing to add this feature to Doctor
    22:43:49 GeraldK    > #info Ildiko: there is discussion on whether Nova or external tool is responsible for fencing. should this be part of Doctor project?
    22:46:25 GeraldK    > #info Ryota: okay to not have gap on fencing, but mention this feature in the architecture section
    22:47:23 r-mibu     > #info agreed
    22:47:39 GeraldK    > #agree mention fencing in "general features", but do not have gap on it
    22:47:58 r-mibu     > #info Maintenance state
    22:48:06 r-mibu     > Change: ("going to maintenance" and "in maintenance")
    22:50:11 r-mibu     > #info DOCTOR-11
    22:51:21 GeraldK    > #info DOCTOR-11 has wider scope. offline review needed.
    22:51:55 GeraldK    > #agree agreement to have two maintenance states ("going-to-maintenance" and "in-maintenance")
    22:52:06 r-mibu     > #info User can stop maintenance
    22:53:09 GeraldK    > #info ...or user does not respond to maintenance request
    22:54:34 GeraldK    > #info error cases needed, e.g. resend maintenance request after timeout
    22:54:49 GeraldK    > #info how to handle cases where a user is sending NACK?
    22:55:09 r-mibu     > #info or having error/force policy 
    22:56:15 GeraldK    > #info give responsibility back to Administrator

June 16, 2015

Agenda:

IRC Meeting Logs: http://meetbot.opnfv.org/meetings/opnfv-doctor/2015/opnfv-doctor.2015-06-16-13.00.html

June 9, 2015

Agenda:

  • BP Status
    • Nova
    • Ceilometer
  • Deliverable status
  • AoB
    • committer list update

IRC Meeting Logs: http://meetbot.opnfv.org/meetings/opnfv-meeting/2015/opnfv-meeting.2015-06-09-13.02.html

June 2, 2015

Agenda:

  • BP Status
    • Nova
    • Ceilometer
  • Deliverable Status

IRC Meeting Logs: http://meetbot.opnfv.org/meetings/opnfv-meeting/2015/opnfv-meeting.2015-06-02-13.02.html

May 26, 2015

Agenda:

  • OpenStack Summit report (related to Doctor)
    • Doctor Breakout Session
    • Ceilometer: event alarm
    • Nova session(s)
  • Short information from informal meeting with ETSI NFV REL, OPNFV HA and OPNFV Doctor at NFV #10 in Sanya
  • BP Status
    • Nova
    • Ceilometer
  • Deliverable
  • Promotions

IRC Meeting Logs: http://meetbot.opnfv.org/meetings/opnfv-meeting/2015/opnfv-meeting.2015-05-26-13.02.html

May 19, 2015

Canceled

May 12, 2015

Agenda:

  • Deliverable
    • Status of Deliverable.
    • Vote on document approval (i.e. declare it stable).
  • Status of BPs
  • OpenStack Summit
    • Preparation for Doctor session
    • Related summit sessions
      • Copper session
      • May 21, 9:50-10:30, Design Summit Ceilometer "Event alarms"
  • Meeting with ETSI NFV REL at NFV #10
    • Joint work w/ ETSI NFV REL "Active Monitoring"

Participants: Gerald Kunzmann, Ryota Mibu, Carlos Goncalves, Bryan Sullivan, Adi Molkho, Dan Druta, Michael Godley, Maryam Tahhan, Tomi Juvonen, Tommy Lindgren, Gurpreet Singh

Minutes:

  • use IRC instead of Etherpad?
    • Ryota will check technical issues, e.g. using the MeetBot
  • Deliverables
    • we have not finished yet
      • further comments received today from E_//; we also have two docx files from Dan
      • few minor syntax errors when compiling the document (in patch set 6)
      • Carlos is working with Octopus team to auto-generate HTML/PDF version of the document, but still buggy (false-positive in jenkins-ci)
      • can we get consensus on the content? would be nice to have a stable version of the document.
    • Voting can also be done via gerrit or email approval (tech-discuss list)
      • common in open source community; we can publish a stable version and then implement bugfixes afterwards
      • no objections in the call on the current version of the deliverable
  • Status of BPs
    • Ceilometer
      • https://review.openstack.org/#/c/172893/
    • Nova
  • OpenStack Summit
    • Tomi will mainly join the Nova sessions
    • Preparation for Doctor session
      • Monday 5pm (after Promise session)
      • Ryota is preparing presentation slides based on deliverable and Prague slides
      • will provide slides by Thursday to collect comments by Doctor team; Bryan asks to upload the draft slides beforehand to discuss it as soon as possible
      • new content compared to Prague (https://wiki.opnfv.org/_media/doctor/opnfv_doctor_prague_hackfest_20150224.n.pptx):
      • collaboration with other OPNFV project team and other SDOs and communities
    • Ceilometer session
    • Copper session
  • ETSI NFV at NFV#10
    • a meeting with REL is scheduled for ETSI NFV meeting in Sanya
    • Dan, Tommy, Gurpreet, Obana-san (DOCOMO) will be there
  • Joint work w/ ETSI NFV REL "active monitoring"
    • proposal for bi-weekly meeting (still under discussion)

May 5, 2015

Agenda:

  • Status of Deliverable
  • Status of BPs
  • Participation at OpenStack Summit

Minutes:

  • Status of Deliverable
  • Status of BPs
  • Participation at OpenStack Summit
  • AOB
    • Tommy: in document we state "Inspector might be based on Monasca"
      • Carlos: originally proposed by NEC; integrated with OpenStack; NEC found some "bugs" and "gaps" (e.g. delay is significantly more than 1s); meet them in Vancouver; it is a candidate, but no other platform seems to be integrated in OpenStack
      • Gerald: meeting with Fujitsu on Monasca two weeks ago
      • Carlos: pluggable architecture, could support Nagios or Zabbix
      • Gerald: in Monasca there is currently no requirement to do reporting within 1s
    • Last meeting with REL
      • Does Tommy plan to meet with ETSI NFV REL? If time allows.
    • ETSI NFV IFA
      • IFA documents not yet open to public

April 28, 2015

Joint meeting with ETSI NFV REL team.
Agenda:

  1. Identify Purpose of the call
    • Collaboration kick-off
  2. NFV REL:
    • Project Overview
    • NFV upgrade
    • Active monitoring and failure detection
  3. OPNFV Doctor:
    • Project Overview
    • Use cases
  4. Collaboration methodology discussion
  5. Wrap-up
 

Minutes:

  1. Purpose
    • Ryota: know each other; see how to work together; further technology discussion needed at later stage
    • Markus Schoeller (NEC): no IPR declarations today, today only exchange of public information
    • policies how to work together w.r.t IPR etc should be defined for later work
    • Gurpreet: high-level of Doctor project; fault-detection and management; what are use cases of Doctor?
  2. NFV REL introduction (Markus Schoeller)
    1. Project overview (see ETSI/NFVREL(14)000200)
      • dedicated reliability project
      • Ryota: target size / number of applications?
      • Tommy: which work items focus on VIM part? indirectly addressed in monitoring and failure detection. scalability per se has some impact on VIM
      • Tommy: this means "monitoring and failure detection" would be the main crossing point with Doctor? so far yes, but in next meeting new WIs may be created
    2. NFV software upgrade mechanism (Stefan Arntzen - Huawei)
      • different to traditional upgrades: "old traffic" can still go to "old software version", whereas new traffic/connections can go to the new s/w version in parallel (this is enabled by virtualization); no hard switchover needed; old system/version is still running and it can be switched back in case of issues with the new version
      • assumption is that this can be done stateless (otherwise it would be more complex)
    3. Active monitoring for NFV (Gurpreet)
      • Alistair Scott: interested in passive monitoring; where are the attachment points for passive monitoring? REL has not looked into passive monitoring for NFV
      • Gurpreet: identify use cases where current implementation has gaps
  3. OPNFV Doctor
    • Stefan: plan to use OpenStack components?
    • Ryota: we are not only focusing OpenStack, but open source in general
    • Tommy: but OpenStack is the primary s/w used in OPNFV
    • Gurpreet: work flow for upstream community?
    • Ryota: define requirements, gap analysis, provide blueprints, but no coding in Doctor project
  4. Next action:
    • arrange meeting in the next NFV event
    • keep in touch
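
The upgrade mechanism presented by Stefan (existing traffic stays on the old version, new connections go to the new version, and rollback stays possible) can be illustrated with a minimal sketch. It assumes the stateless case mentioned in the minutes. All names (`VersionRouter`, `start_upgrade`, `rollback`, `route`) are hypothetical and not taken from any ETSI NFV REL document.

```python
class VersionRouter:
    """Toy model: pin existing connections, send new ones to the active version."""

    def __init__(self, old_version, new_version):
        self.old_version = old_version
        self.new_version = new_version
        self.active = old_version      # version that receives new connections
        self.pinned = {}               # connection id -> version serving it

    def start_upgrade(self):
        # No hard switchover: only new connections move to the new version.
        self.active = self.new_version

    def rollback(self):
        # The old version is still running, so new connections can go back.
        self.active = self.old_version

    def route(self, conn_id):
        # A known connection stays on the version it started with.
        if conn_id not in self.pinned:
            self.pinned[conn_id] = self.active
        return self.pinned[conn_id]


router = VersionRouter("v1", "v2")
first = router.route("conn-a")        # opened before the upgrade
router.start_upgrade()
second = router.route("conn-b")       # opened after the upgrade
still_first = router.route("conn-a")  # old traffic stays on the old version
print(first, second, still_first)
```

Pinning by connection id is only one possible way to keep "old traffic" on the old version; a stateful service would need more than this.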

April 21, 2015

Agenda:

  1. Deliverable
    • Structure: uploaded to Gerrit and split into multiple files; need consensus from community
    • Propose requirement project deliverable template based on Doctor's (WIP: Carlos, Ryota, Ikdiko)
    • Review comments received so far
  2. Blueprints

Minutes:

  • Status of BPs
    • Nova BP
      • concept has been accepted
      • single API to mark down nova-compute and change status of VMs
      • the scope has been narrowed, topic was modified to "mark-host-down"
    • Ceilometer BP
      • trying to have summit session regarding Ceilometer event topic
      • the demo that Ryota mentioned in Gerrit is the same as the one at the Prague Hackfest
  • Deliverable
    • We still have review comments which are not yet reflected in the doc
    • RST files have been split; the format would serve as a template for other requirement projects
    • how we can publish ...
  • Inspector API
    • API point is OK
    • action(doctor): describe framework and inspector API
  • Logistics
    • from next week, we will start to use IRC (e.g. sharing links)
    • at #opnfv-meeting channel
  • Next meeting
    • joint meeting with NFV REL
    • action: Ryota to send out agenda

April 14, 2015

Agenda:

  1. Status of BPs
  2. Doctor requirement deliverable
  3. Doctor Southbound API

Minutes:

  1. Status of BPs
  2. Doctor requirement deliverable
    • Carlos to complete conversion by this week
    • contact Octopus on HTML conversion tool. If necessary, help Octopus talk to TSC about the positioning of Requirements deliverables
  3. API in between OpenStack and HW/NFVI monitoring module e.g. Zabbix (Ryota)
    • Change "southbound API" to API in between OStack and HW/NFVI monitoring module

April 7, 2015

Agenda:

  1. Input from Swfastpathmetrics team
  2. Status of BPs

Minutes:

  1. Input from Swfastpathmetrics team
    • current integration plan is https://collectd.org/ and also ceilometer
    • Revisit the scope of the NIC to make sure that we can collect VF stats.
      • TBD
    • Can the NIC report VF/PF stats capabilities? Investigate: Maryam
      • I’ve been looking into this for Intel® 82599 10 GbE Controller, and this might be possible through a level of indirection by checking what VFs are enabled. It’s not exactly what’s being asked, but if you knew a VF was enabled then you’d know what stats are also available.
      • BTW: Stats can then be retrieved per VF for Niantic:
        • VF Good Packets Received Count
        • VF Good Packets Transmitted Count
        • VF Good Octets Received Count Low
        • VF Good Octets Received Count High
        • VF Good Octets Transmitted Count Low
        • VF Good Octets Transmitted Count High
        • VF Multicast Packets Received Count
      • But then error stats are still shared.
      • Open question Maryam is looking into: if we knew the queues assigned to a VF, could we use Queue Packets Received Drop Count (QPRDC) to retrieve the dropped packets for a VF?
    • Maryam in the process of writing a DPDK app that runs as a secondary process on the host and is capable of reading the stats, which can then be parsed by a script.
      • Still in progress
    • No Southbound interface for Doctor defined yet.
    • Action: Ryota to draft SB API of Doctor
  2. Status of BPs
  3. Status of requirement deliverable
  4. Status and next steps of BPs (Tomi, Ryota)
  5. Nova BP review
  6. Input from Swfastpathmetrics team

Minutes:

  1. Status of requirement deliverable: Distributed to OPNFV community
  2. Discussion about leveraging OpenStack Zaqar as multi-tenant messaging system for real-time event notifications
  3. HTTP vs SNMP
  4. Input from Swfastpathmetrics team:
    • https://collectd.org/
    • The following set of statistics are those that are collected explicitly for the physical NIC (Intel 10G Niantic)
      • Physical Function (PF) Summed up/total Receive Errors/Drop Statistics Reported (note: these are per port)
        • PF Packet drop: Sum of (all drop registers we list) excluding error registers
        • PF Packet errors: Sum of (all error registers we list) excluding drop registers
        • PF PHY errors: Sum of (all PHY errors we list)
        • PF General errors: Sum of (all other rx/tx error regs we list)
        • PF Missed RX: A missed packet is a packet that was correctly received by the NIC but because it was out of descriptors and internal memory, the packet had to be dropped by the NIC itself
      • Physical Function (PF) Individual Receive Errors/Drop Statistics Reported
        • Illegal Byte Error Count: Counts the number of receive packets with illegal bytes errors (such as there is an illegal symbol in the packet).
        • Error Byte Count: Counts the number of receive packets with error bytes (such as there is an error symbol in the packet).
        • Rx Missed Packets Count
        • MAC Local Fault Count: Number of faults in the local MAC.
        • MAC Remote Fault Count: Number of faults in the remote MAC.
        • Receive Length Error Count: Number of packets with receive length errors. A length error occurs if an incoming packet length field in the MAC header doesn't match the packet length.
        • Receive Undersize Count: This register counts the number of received frames that are shorter than minimum size (64 bytes from <Destination Address> through <CRC>, inclusively), and had a valid CRC.
        • Receive Fragment Count: Number of receive fragment errors (frame shorter than 64 bytes from <Destination Address> through <CRC>, inclusively) that have bad CRC (this is slightly different from the Receive Undersize Count register)
        • Receive Oversize Count: This register counts the number of received frames that are longer than maximum size as defined by MAXFRS.MFS (from <Destination Address> through <CRC>, inclusively) and have valid CRC.
        • Receive Jabber Count: Number of receive jabber errors. This register counts the number of received packets that are greater than maximum size and have bad CRC (this is slightly different from the Receive Oversize Count register). The packets length is counted from <Destination Address> through <CRC>, inclusively.
        • Management Packets Dropped Count: Number of management packets dropped. This register counts the total number of packets received that pass the management filters and then are dropped because the management receive FIFO is full. Management packets include any packet directed to the manageability console (such as RMCP and ARP packets).
        • MAC Short Packet Discard Count: Number of MAC short packet discard packets received.
        • XSUM Error Count: Number of receive IPv4, TCP, UDP or SCTP XSUM errors.
        • FC CRC Error Count: Count the number of packets with good Ethernet CRC and bad FC CRC
        • FCoE Rx Packets Dropped Count: Number of Rx packets dropped due to lack of descriptors.
      • And the final thing we measure is packet latency through DPDK.
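
The summed PF statistics above amount to grouping raw per-port counters by category and adding them up. A minimal sketch follows, assuming a hypothetical mapping from counter names to categories; the names, the category assignment and the sample values are illustrative placeholders, not real 82599 register names or readings.

```python
from collections import defaultdict

# Hypothetical mapping of counter name -> category. The real register list
# is whatever the Swfastpathmetrics team collects for the NIC.
CATEGORY = {
    "rx_missed_packets": "drop",
    "mgmt_packets_dropped": "drop",
    "illegal_byte_errors": "error",
    "rx_length_errors": "error",
    "mac_local_faults": "phy",
    "mac_remote_faults": "phy",
}


def summarize(counters):
    """Sum per-port counters into one total per category."""
    totals = defaultdict(int)
    for name, value in counters.items():
        # Counters without an explicit category fall into "general".
        totals[CATEGORY.get(name, "general")] += value
    return dict(totals)


sample = {  # made-up values for one port
    "rx_missed_packets": 12,
    "mgmt_packets_dropped": 3,
    "illegal_byte_errors": 1,
    "rx_length_errors": 4,
    "mac_local_faults": 2,
    "xsum_errors": 7,
}
print(summarize(sample))
```

Keeping drop and error counters in disjoint categories mirrors the "excluding error registers" and "excluding drop registers" wording of the minutes.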

March 24, 2015

Participants: Ryota, Tomi, Bertrand, Gerald
Agenda:

  1. Status and next steps of BPs (Tomi, Ryota)
  2. Nova BP review
  3. Input from Swfastpathmetrics team
  4. Status of requirement deliverable
  5. Document Review

Minutes:

  • Recap of last week's BP meeting (Thursday)
    • Ryota had presented; request was to use the template; more details needed, but general approach is okay
    • Bryan has raised issue that more discussion on notification on NB I/F interface is needed
      • important which protocol and data structure is used
    • Proposal of Doctor should be aligned with other projects
    • Ashiq: does anyone have experience in writing BPs -> Tomi: using the template it was straightforward
      • proposal to find someone with experience. Ryota already has some experience.
      • Ryota: problem is who could review our BPs. we need to socialize with community. Ryota has some channels he can use for this.
    • it is not clear if we need TSC approval for submitting the BPs; at least we should align at OPNFV level
    • Ashiq: it is important people join the meetings, e.g. the BP meeting, the individual BP meetings, socialize the BPs in the OPNFV community (by discussing it on the mailing list)
    • Tomi: proposal to send mail to tech-discuss asking for any objections within few days, then upload
      • Ryota: we need this window to also review other contributions
    • Ashiq: already upload the BPs and revise it in case needed
  • Status and next steps of BPs (Tomi, Ryota)
    • Tomi: BP following template is ready
    • Ryota: target is Thursday to have BPs ready following the template
  • Nova BP review
  • Input from Swfastpathmetrics team
    • no one from this team; skip topic for next week
  • Status of requirement deliverable
    • draft distributed yesterday.
    • ACTION: perform Doctor internal review by end of this week.
    • Stable draft will be provided to two weeks OPNFV-wide review on Monday March 30th
    • Ashiq: architecture with 4 building blocks. Inspector filters some fault information. The Notifier has a policy that also filters which fault information to send.
      • why do you enable the filtering in both Inspector and Notifier? Ryota: different kind of filters. in OpenStack all alarms from Controller will be emitted, thus we need to have policy to filter.
      • Ashiq: Inspector is filtering physical faults. notifier is filtering faults on virtual resource level. Correct?
  • Document Review
  • Others:
    • join HA meetings from time to time to monitor their progress; check status of their BPs and offer help.
      • Ryota had joined the last HA meeting: main activity is requirement document; no discussion regarding BP
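
The two filtering stages discussed under "Status of requirement deliverable" can be sketched as follows. The sketch follows Ashiq's reading (Inspector filters physical faults, Notifier filters at the virtual resource level), which the minutes leave as an open question. The `Fault` type, both function names and all sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    host: str
    kind: str       # e.g. "fan_speed", "nic_link_down"
    severity: str   # "critical" or "warning"


def inspector_filter(faults, ignored_kinds):
    """First stage: drop physical faults that are not of interest."""
    return [f for f in faults if f.kind not in ignored_kinds]


def notifier_filter(faults, host_to_vms, subscribed_vms):
    """Second stage: keep only (vm, fault) pairs allowed by the policy."""
    return [
        (vm, f)
        for f in faults
        for vm in host_to_vms.get(f.host, [])
        if vm in subscribed_vms
    ]


faults = [
    Fault("host1", "nic_link_down", "critical"),
    Fault("host1", "fan_speed", "warning"),
    Fault("host2", "nic_link_down", "critical"),
]
physical = inspector_filter(faults, ignored_kinds={"fan_speed"})
alarms = notifier_filter(
    physical,
    host_to_vms={"host1": ["vm1", "vm2"], "host2": ["vm3"]},
    subscribed_vms={"vm1", "vm3"},
)
print([(vm, f.host) for vm, f in alarms])
```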

March 17, 2015

Agenda:

Minutes:

  • Status of requirement deliverable
    • Updated interface specification and related information elements in Section 5
    • New (merged) “Figure 12 – Implementation plan in Ceilometer architecture” as discussed in last call.
    • 5.1.3: Tommy pointed out that the Controller does not have to notify updated capacity to other modules; this is out of our scope
    • 5.2.1: we should remove step 9 and step 10 from Figure 7 - Fault management work flow
  • Status of BPs
    • Nova:
      • see slideset from 10.3.2015
      • Gap is "immediate notification" for faults over the NB I/F
      • Two alternative ways to implement it:
        • New Nova API -> might not be accepted easily
          • offline mail discussion ongoing
      • Use Pacemaker servicegroup driver -> might not address all requirements of Doctor
      • TL: Is there some possibility to ask Nova to update its state? TJ: currently doing some testing on this; also to measure the reaction times
      • TL: servicegroup is within one cluster. there is also a pacemaker remote, but might be slow
      • RM: blog post by Russell: http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/
    • Ceilometer
      • proposal to discuss "Event-Publisher for Alarm" and "Notification-driven alarm evaluator" for discussion in Thursday tech meetings (see email from Hu, Bin)
      • Ryota is doing code review in OpenStack
        • plan is to stop new "meter" due to performance issues. idea is to have a new type of notification.
      • TL: feedback from Ceilometer team? not yet; Ryota has not introduced Doctor project in this community. plan to do after OPNFV BP review.

March 5, 2015

Agenda:

Minutes:

  • Status of Document
    • 3.1 there is an unclarified maintenance use case --> Tommy will send ETSI Doc Ph2 to us
    • 3.1.2 text is missing --> Gerald can add text
    • 3.5
    • 4 Gap analysis: are there any related BPs that should be added as references? --> BP links should be added
      • Carlos will help check if there are blueprints already filled in and add to the document
    • 5 Detailed implementation plan --> all, please check this chapter
    • Fig.9 what does the maintenance sequence look like? --> TBD (tentative FB and sequence are proposed)
    • Schedule
      • 1 week for Doctor internal review (-3/17)
      • 2 weeks for OPNFV community review (-3/31)
      • 2 weeks for Doctor team work (-4/14)
      • (OPNFV release 1 4/23)
  • Status of BPs: not handled

March 5, 2015

Ad-hoc meeting for blueprint planning

Agenda:

  • Discuss BPs
  • Report on meeting with HA team

Minutes:

  • Presentation by Ryota on Ceilometer and where our BPs fit to the Ceilometer architecture (see slides)

March 3, 2015

Agenda:

  • BPs alignment
  • Mike from Intel / swfastpathmetric will join this team
  • Hijacking the doctor meeting to discuss blue-prints next week for all projects
  • Doc status

Minutes:

  • BPs alignment
    • https://etherpad.opnfv.org/p/doctor_bps
    • Bring BP topic also to TSC
    • BP should have a list of parameters/data missing
    • OpenStack BPs shall be in this format (using OpenStack terminology such that a developer can read/understand it)
    • Proposal to have a high-level description of the BPs in the Wiki
    • Ceilometer is the right place to implement such feature, although other alternatives may exist
    • TODO(Ryota): prepare slides, provide IRC available time
  • Hijacking the doctor meeting to discuss blue-prints next week for all projects:
    • we should keep 30min for discussing the requirement deliverable
  • Doc status: not handled

Feb 24, 2015

Requirement project round table @ Prague Hackfest

Participants: Ryota (NEC), Gerald (DOCOMO), Bertrand (DOCOMO), Ashiq (DOCOMO), Tomi (Nokia), Tommy (Ericsson), Carlos (NEC), Gianluca Verin (Athonet), Daniele Munaretto (Athonet), Sharon (ConteXtream), Christopher (Dorado Software), Russell (Red Hat), Frank Baudin (Qosmos), Chaoyi (Huawei), Al Morton (AT&T), Xiaolong (Orange), (Oracle), Randy Levensalor (CableLabs) ...

Slides can be found here: https://wiki.opnfv.org/_media/doctor/opnfv_doctor_prague_hackfest_20150224.n.pptx

Minutes:

  • Use case 1 "Fault management"
    • Main interest: northbound I/F
    • Reaction of VNFM is out of scope
    • VM (compute resources) is the first focus, storage and network resources will follow at later stage
  • Fault monitoring: pluggable architecture is needed to catch different (critical) faults in NFVI to enable use of different monitoring tools. Predictor (fault prediction) may also be one input.
  • 4 functional blocks:
    • controller (e.g. Nova), monitor (e.g. Nagios, Zabbix), notifier (e.g. re-use Ceilometer), inspector (fault aggregation etc)
  • VM state in resource map, e.g. "fault", "recovery", "maintenance" (more than just a heartbeat)
  • Question of whether other OpenStack components (e.g. Cinder, Glance, etc) can report events/faults
  • What is the timescale to receive such fault notification? this would be helpful for the motivation in the blueprints. Telco nodes: i.e. less than 1s, switch to ACT-SBY as soon as possible.
  • Preference is for event-based notification, not polling. This should be configurable.
  • Telco use case would have few hundreds of nodes, not thousands of nodes.
  • Demo 1 (using default Ceilometer) takes approximately 65 seconds to notify the fault (90 seconds total including spawning new VM), while demo 2 only takes <= 1 second (26 seconds total)
  • Pacemaker is running at application layer; different scope.
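
The pluggable, event-based monitoring idea and the VM state in the resource map can be sketched as below. This is a minimal illustration under stated assumptions: `Inspector`, `CannedMonitor`, `add_vm` and `on_fault` are hypothetical names, the "active" state is added for the example (the minutes only name "fault", "recovery" and "maintenance"), and the monitor is a stand-in that replays fixed events rather than a wrapper around Nagios or Zabbix.

```python
class Inspector:
    """Aggregates fault events pushed by monitor plugins."""

    def __init__(self):
        self.vm_state = {}       # resource map: VM name -> state
        self.host_to_vms = {}    # which VMs run on which host

    def add_vm(self, vm, host):
        self.vm_state[vm] = "active"
        self.host_to_vms.setdefault(host, []).append(vm)

    def on_fault(self, host):
        # Event-based entry point: any monitor plugin calls this on a fault.
        affected = self.host_to_vms.get(host, [])
        for vm in affected:
            self.vm_state[vm] = "fault"
        return affected


class CannedMonitor:
    """Stand-in for a real monitoring tool; replays a fixed list of events."""

    def __init__(self, callback, failing_hosts):
        self.callback = callback
        self.failing_hosts = failing_hosts

    def run(self):
        for host in self.failing_hosts:
            self.callback(host)


inspector = Inspector()
inspector.add_vm("vm1", "host1")
inspector.add_vm("vm2", "host2")
CannedMonitor(inspector.on_fault, ["host1"]).run()
print(inspector.vm_state)
```

Because monitors only need a callback, a different monitoring tool can be plugged in without touching the inspector.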

Feb 23, 2015

Doctor/Fastpathmetrics/HA Cross Project Meeting @ Prague Hackfest

Goal:

  • reduce conflicts between requirements projects

Minutes:

  • Project Intro:
  • Identify Overlap:
    • NB I/F
      • Doctor is also requiring fast reaction. objective with HA is similar.
      • HA has more use cases and may send more information on the northbound I/F. VNFM should be informed about changes.
      • Doctor objective is to design a NB I/F.
        • Does HA already have flows available?
        • HA is focusing on application level. Reaction should be as fast as possible. Including the VNFM may slow down the progress.
        • In Doctor we will follow the path through VNFM.
        • In ETSI we have lifecycle mgmt, where VNFM is responsible for the lifecycle
    • There is certain information the VNFM doesn't know about. In Doctor we call it "consumer".
    • Proposal to do use case analysis for HA. Which use cases may require the VNFM to be involved? "Doctors" will have a look at HA use cases.
    • Which entity resolves race conditions? Some entity in the "north".
    • What about a shared fault collection/detection entity instead of collecting the same information 3 times?
      • Predictor could also notify immediate failures to Doctor.
    • Security issues are not addressed in Doctor. Currently assuming a single operator, where policies ensure who can talk to who.
    • In Doctor we do not look at application faults, only NFVI faults.
    • Huawei: we use Heat to do HA. If one VM dies and Heat finds the Scaling Group has fewer than 2 members, it will start a new VM. This may take more than 60s; we need to find something faster for HA. Heat doesn't detect errors in the applications.
    • Failure detection time is an issue across all projects.
    • Which metrics of fastpath would Doctor be interested in? need to check in detail. Action Item to send metrics to Doctor.
    • Hypervisor may detect failure of VM and take action.
      • Other failures: VM is using heartbeat. it will e.g. reboot after not receiving a heartbeat for 7s.
    • Doctor: if VIM takes action on its own it may conflict with the ACT-SBY configuration at the consumer side. this is why the consumer should be involved.
    • Which project would address ping-pong issue that may arise?
    • We need subscription mechanism including filter (which alarms to be notified about). Mapping VM-PM-VNFM can be recorded during the instantiation.
    • Relationship between Doctor and Copper:
      • policy defines e.g. when VIM can expose its interface
      • When to inform a fault, whom to inform etc is all a kind of policy.
      • Copper has both pro-active and reactive deployment of policies. In reactive case, there may be conflict when both Copper and Doctor receive the policies.
  • Wrapup:
    • Overlap in fault management
    • FastPath: monitor traffic metrics; Doctor will need some of the metrics in the VIM. plan to do regular meetings.
    • HA: large project with wider scope than Doctor, different use cases. direct flow (to be faster). task to check each other's NB I/F in order not to block each other.
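
The subscription mechanism with filters, and the mapping recorded at instantiation, could look roughly like the following sketch. `AlarmBroker` and its methods are hypothetical names chosen for illustration, and the alarm types are made up; nothing here reflects an agreed Doctor interface.

```python
class AlarmBroker:
    """Toy subscription registry keyed by consumer (e.g. a VNFM)."""

    def __init__(self):
        self.subscriptions = []   # (consumer, set of alarm types, set of VMs)
        self.vm_to_host = {}      # VM -> physical machine mapping

    def record_instantiation(self, vm, host):
        # The mapping is recorded when the VM is instantiated.
        self.vm_to_host[vm] = host

    def subscribe(self, consumer, alarm_types, vms):
        # The filter states which alarms the consumer wants to hear about.
        self.subscriptions.append((consumer, set(alarm_types), set(vms)))

    def publish(self, host, alarm_type):
        """Return the (consumer, vm) pairs that should be notified."""
        affected = {vm for vm, h in self.vm_to_host.items() if h == host}
        return sorted(
            (consumer, vm)
            for consumer, types, vms in self.subscriptions
            if alarm_type in types
            for vm in affected & vms
        )


broker = AlarmBroker()
broker.record_instantiation("vm1", "host1")
broker.record_instantiation("vm2", "host1")
broker.record_instantiation("vm3", "host2")
broker.subscribe("vnfm-a", ["host_down"], ["vm1", "vm3"])
broker.subscribe("vnfm-b", ["cpu_temp"], ["vm2"])
print(broker.publish("host1", "host_down"))
```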

Feb 17, 2015

Agenda:

  • Ashiq's proposed agenda for the Prague Hackfest next week
  • Doctor PoC Demo
  • Document Status

Minutes:

  • Participants: Ryota Mibu, Khan Ashiq, Gerald Kunzmann, Carlos Goncalves, Susana, Thinh Nguyenphu, Tommy Lindgren, Bryan Sullivan, Bertrand Souville, Michael Godley, Manuel Rebellon, Uli Kleber
  • Hackfest
    • https://etherpad.opnfv.org/p/HackfestFeb23
    • Ryota to prepare some slides on what's going to be presented in the demo by end of week
      • Carlos will help Ryota
    • Requirement projects are scheduled for Tuesday
    • Demo:
      • OpenStack Controller, Zabbix, 2-3 OpenStack compute servers to launch VMs, client to stress the system, Neutron, Nova, LB as a service, Heat, Ceilometer
      • Destroy one of the VMs running a WebService
      • Key message to OpenStack? Which gap do you want to present? Why use Zabbix instead of Ceilometer (show first gap in our list)? Prepare for such questions.
  • Document status
  • HA and fault prediction project and "Software FastPath Service Quality Metric" project
    • Proposal to meet Monday afternoon after BGS project meeting
      • Carlos will contact all potential participants

Feb 10, 2015

Agenda:

Minutes:

  • OPNFV should be careful with tools projects use and distribute as part of the platform due to their licensing
  • Framework should be modular enough to be pluggable with multiple monitoring solutions
  • Editors for each first deliverable section were assigned
  • Gap analysis to be further extended
  • Section editors should have an initial draft ready by Feb 18
  • Deliverable editors (Gerald and Ashiq) will have Feb 19-20 to compile everything together for the Prague Hackfest

Feb 6, 2015

Extra meeting for Implementation Planning

Agenda & Minutes:

  • Implementation Planning
    • Topic and agreement can be found in Slides.

Feb 2, 2015

Agenda:

Minutes:

  • Participants: Carlos Goncalves, Don Clarke, Ryota Mibu, Tomi Juvonen, Yifei Xue, Al Morton, Bertrand Souville, Gerald Kunzmann, Manuel Rebellon, Ojus K. Parikh, Ashiq Khan, Pasi, Paul French, Charlie Hale
  • Ryota presents a refreshed Timeline
    • https://etherpad.opnfv.org/p/doctor
    • Initial draft of requirement document should be ready before the Hackfest 23-24 Feb in Prague
    • Ashiq asks about task allocation. See: https://etherpad.opnfv.org/p/doctor
    • Target architecture is OpenStack; Implementation plan is on how this will be realized in upstream projects, e.g. interfaces.
      • one proposal is using Zabbix. all is already there.
  • Predictor project:
    • still in proposal phase. We should keep an eye on it; it is related to Doctor
  • Implementation plan:
    • for evacuation we should stay implementation independent, not OpenDayLight or Neutron (they may use it in the actual testbed, but we should restrict Doctor to the interfaces definition)
    • it is not intended to use Ceilometer, but a similar service.
      • Agreement to use Zabbix for the GapAnalysis.
      • Doctor will have its own REST API as a wrapper abstracting the monitoring solution in use underneath (e.g. Zabbix)
    • it is necessary to be able to isolate a faulty machine, such that new VMs are not started on this machine.
    • different ways/workflows for recovery; we should start by implementing a few sample workflows
      • e.g. switch to active hot standby VM, then instantiate a new hot standby instance (this is a Doctor requirement)
      • evacuation (if time allows) vs active hot standby (immediate action)
      • VNFM is deciding about the best action (this is out of scope of Doctor; Doctor only specifies NB I/F)
    • we need to get into more details for this plan. discussion should go via email to make progress before next meeting
  • Hackfest
    • Take to the hackfest what we have, i.e. if we "only" have one implementation plan so far let's use this.
    • https://etherpad.opnfv.org/p/HackfestFeb23
    • Doctor is planned for Tuesday. Also other requirement projects will be discussed on Tuesday.
  • Ryota did cleanup of Doctor Wiki page
  • Doctor team participation in the OpenStack Summit Vancouver?
    • related topics.
    • most important blueprints should be ready by May and could be presented there
    • Proposal: Talk on a more general topic including Doctor requirements
    • Carlos will look into it
  • Meeting time -> via email
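
One of the sample recovery workflows from the implementation plan (isolate the faulty machine, switch to the hot standby, then instantiate a new standby) can be sketched as below. The function name and data layout are hypothetical. Note that in the minutes the VNFM decides on the best action and Doctor only specifies the NB I/F, so this models the overall workflow rather than Doctor itself.

```python
def recover(pair, hosts, faulty_host):
    """Fail over an ACT-SBY pair and re-create the standby on a healthy host.

    pair:  {"active": host name, "standby": host name or None}
    hosts: {host name: {"enabled": bool}}
    """
    # Isolate the faulty machine so that no new VM is started on it.
    hosts[faulty_host]["enabled"] = False

    if pair["active"] == faulty_host:
        # Immediate action: the hot standby takes over.
        pair["active"] = pair["standby"]
        pair["standby"] = None
    elif pair["standby"] == faulty_host:
        pair["standby"] = None

    if pair["standby"] is None:
        # Then instantiate a new hot standby on an enabled host that does
        # not already run the active instance.
        for name in sorted(hosts):
            if hosts[name]["enabled"] and name != pair["active"]:
                pair["standby"] = name
                break
    return pair


hosts = {"h1": {"enabled": True}, "h2": {"enabled": True}, "h3": {"enabled": True}}
pair = {"active": "h1", "standby": "h2"}
print(recover(pair, hosts, "h1"))
```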

Jan 26, 2015

Agenda:

  • Discuss maintenance use case - Tommy
  • Implementation outside Nova - Tomi

Minutes:

  • Timeline milestone planning
    • Soft schedule for Fault Table, set 1 milestone end of Jan
    • Requirement Document should be finished by Mar 15 ? - No
      • Ashiq suggested doc should be finished by end of February
    • Set some milestone on Hackfest at Prague
      • First draft
    • TODO(Ryota): create wiki page
  • Discuss maintenance use case - Tommy
  • Implementation outside Nova - Tomi
    • https://wiki.opnfv.org/_media/doctor-opnfv-proposal.pptx
    • FYI: http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal/
    • Network resources?
      • Evacuation will also move the network, regardless of whether it is OpenDaylight or Neutron.
      • We are trying to tackle step-by-step, first focusing on Nova.
      • Ceilometer approach seems to be better than using metadata on Nova
        • What is the relation to Nova metadata? Ceilometer is terrible for FM. It uses polling and is suited to PM. It would be an extra step causing delay. It generates a lot of network traffic. The database consumes a lot of memory.
    • Should Doctor trigger a power-off of the host?
      • One needs to fence the host by powering it off, either by shutdown through the OS (or Nova) or, if it is reachable only with IPMI, through that. In some cases the host can be rebooted as recovery, but in most cases it is faulty and needs to be moved to a disabled aggregate or marked for maintenance. If one cannot reach the host at all, evacuation through Nova will isolate the host anyhow, as everything will be moved to another host (network, disk).
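
The fencing decision described in the last item can be written as a small function. This is only a sketch of the order of preference stated in the minutes; the function name and the returned strings are hypothetical, and no real Nova or IPMI call is made.

```python
def choose_fencing(os_reachable, ipmi_reachable):
    """Pick how to fence a faulty host, in the order given in the minutes."""
    if os_reachable:
        return "shutdown via OS or Nova"
    if ipmi_reachable:
        return "power off via IPMI"
    # Host cannot be reached at all: evacuation moves network and disk to
    # another host, which isolates the faulty host anyway.
    return "evacuate via Nova"


for os_ok, ipmi_ok in [(True, True), (False, True), (False, False)]:
    print(os_ok, ipmi_ok, "->", choose_fencing(os_ok, ipmi_ok))
```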

Jan 19, 2015

Agenda:

  • Review of timeline of Doctor project
  • List of tasks

Minutes:

Jan 12, 2015

Agenda:

Minutes:

  • Fault table: https://wiki.opnfv.org/doctor/faults
    • Action: check and revise / update / extend
    • Some faults are specific to certain HW, others are more general
    • We should try to come up with a high level description of common faults
    • Proposal not to go to such level of detail.
    • Keep one fault table and use the current fault table for study of the scope of Doctor
    • Are there other faults that cannot be detected by SNMP and Zabbix_agent?
    • We need a tool in Doctor that can retrieve such alarms. Should this tool be integrated with OpenStack or be independent? Should be kept open;
  • ETSI meeting in Prague: proposal to meet there
  • Doctor "working page": https://etherpad.opnfv.org/p/doctor_gap_analysis
    • Action: edit this page for ongoing work on the gap analysis
  • Doctor wiki page updated
  • Timeplan:
    • Action: Ryota to prepare a timeplan/timeline
    • Timeplan can be checked in each week's meeting
    • Reminder: some documents should be available by March
  • Next meeting: Jan 19th

Dec 22, 2014

Agenda:

  • work item updates
  • Fault table
  • GAP analysis template
  • Wiki pages

Minutes:

  • Work item updates
    • Fault table
      • Status: waiting for Palani's initial commit
      • Tomi also made initial list of faults.
      • TODO(Tomi): Open new wiki page to share the fault list
    • GAP analysis template
    • Wiki pages
      • Our plan for wiki/doc structure seems to be OK, because there were no questions or objections in the past week.
      • TODO(Ryota): Update wiki pages
  • Fault notification at the Northbound I/F
    • Critical faults
      • It was agreed that we should characterize faults as critical or non-critical when reporting to VNFs.
      • We must report all critical faults northbound. We may report some of the non-critical faults, need further study.
    • Fault aggregation
      • Discussed whether to aggregate different alarms and faults before notifying via northbound interface to VNFs.
      • General agreement that there should be some level of aggregation, but need to figure out what events need to be aggregated.
      • Some suggested that VNFs should be notified only if the faults are urgent.
    • Notifying data center operations folks about hardware faults is something that seems to be out of scope for this project. Tomi: I think they need the information and there should not be a duplicate mechanism to detect faults to be able to make HW maintenance operations. Surely they will not need the notification that we would send to VNFM, but the actual alarm information we are gathering to make those notifications. Anyhow I agree that this is not in our scope and tools like Zabbix that we could use here can easily be configured then for this also in case HW owner is interested.
    • Why should warnings be sent to VNFs (such as cpu temp rising but not critical yet)? VNFs might want to take action such as setup/sync hot standby and this could take some time.
  • Are there open source projects already to detect hypervisor or host OS faults?
    • OpenStack Nova devs said it should be kept simple, providers need to monitor processes on their own.
    • But there appear to be some open source tools (SNMP polling or SNMP agents on host). Need to pull things together.
  • Next call will be on January 12th.
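
The agreement on critical versus non-critical faults and on aggregation can be illustrated with a minimal sketch. It assumes per-VM aggregation and a simple switch for warnings; both are choices made for the example, since the minutes leave open which events to aggregate and which non-critical faults to report. The function name and sample faults are hypothetical.

```python
def northbound_report(faults, report_warnings=False):
    """Aggregate faults per VM and decide which ones go northbound.

    faults: iterable of (vm, description, severity), where severity is
            "critical" or "warning".
    """
    per_vm = {}
    for vm, description, severity in faults:
        # All critical faults are reported; warnings only if enabled.
        if severity == "critical" or report_warnings:
            per_vm.setdefault(vm, []).append(description)
    # One aggregated notification per VM instead of one per fault.
    return {vm: "; ".join(items) for vm, items in per_vm.items()}


faults = [
    ("vm1", "host power loss", "critical"),
    ("vm1", "nic link down", "critical"),
    ("vm2", "cpu temp rising", "warning"),
]
print(northbound_report(faults))
print(northbound_report(faults, report_warnings=True))
```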

Dec 15, 2014

Agenda:

Minutes:

  • wiki/doc structure
    • Agreed to have three sections
      • UseCase (High-level description)
      • Requirement (Detail description, GAP Analysis)
      • Implementation (includes monitor tools and alternatives)
  • Faults table
    • will create table that explains stories for each fault
    • columns would be physical fault, how to detect, affected virtual resource and actions to recover
    • in three categories Compute, Network and Storage, will start on Compute first
    • also try to keep separate table/categories for critical and warning
    • TODO(Palani): provide fault table example
    • TODO(Gerald): create first version of fault table after getting table example
  • framework
    • how to handle combinations of faults and future H/W faults is still an open question
    • suggestion to have fault management "framework" that should be configurable to define faults by developers or operators
  • Gap analysis
    • We should have list of items so that we can avoid duplicated work
    • TODO(Ryota): Post first item to show example how we describe that could be template for GAP analysis
  • Monitoring
    • We should check monitoring tools as well: Nagios, Ganglia, Zabbix
  • Check TODOs from the last meeting
    • seems almost all items have been done or started (but we could not check 'fault management scenario based on ETSI NFV Architecture' although there is a slide on wiki)
  • Next meetings
    • Dec 22, 2014
    • Jan 12, 2015 # skip Jan 5th
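
The fault table agreed above (columns for physical fault, detection method, affected virtual resource and recovery action, split by category and by critical/warning) maps naturally onto a small record type. The sketch below is illustrative only: `FaultEntry`, `select` and the example rows are hypothetical and not taken from the Doctor fault table.

```python
from dataclasses import dataclass


@dataclass
class FaultEntry:
    category: str            # "Compute", "Network" or "Storage"
    severity: str            # "critical" or "warning"
    physical_fault: str
    how_to_detect: str
    affected_resource: str
    recovery_action: str


# Example rows are made up for illustration.
FAULT_TABLE = [
    FaultEntry("Compute", "critical", "processor failure",
               "hardware alarm", "VM", "switch to standby"),
    FaultEntry("Compute", "warning", "CPU temperature rising",
               "sensor reading", "VM", "prepare standby"),
    FaultEntry("Network", "critical", "NIC link down",
               "link status event", "virtual port", "migrate VM"),
]


def select(table, category, severity):
    """Return the physical faults for one category/severity cell."""
    return [
        e.physical_fault
        for e in table
        if e.category == category and e.severity == severity
    ]


print(select(FAULT_TABLE, "Compute", "critical"))
```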

Dec 8, 2014

Agenda:

  • How we shape requirements
  • Day of the week and time of weekly meeting
  • Tools: etherpad, ML, IRC?
  • Project schedule, visualization of deliverables

Minutes:

  • How we shape requirements
    • Use case study first
    • Gap analysis should include existing monitoring tools like Nagios etc.
    • How do we format fault messages and VNFD elements for alarms?
    • Fault detection should be designed in a common/standard manner
    • Those could be implemented in existing monitoring tools separate from OpenStack
    • What are "common" monitoring tools? There are different tools and configurations
    • Focus on H/W faults
    • Do we really need that kind of notification mechanism? Can we use error from API polling, just get error detected by application or auto-healing by VIM?
      • A real vEPC needs to know about faults that cannot be found by the application, such as abnormal temperature.
      • VIM should not run auto-healing for some VNF.
      • There are two cases/sequences defined in ETSI NFV MANO in which fault notifications are sent from VIM to VNFM and to Orchestrator.
      • An alarming mechanism is good for reducing the number of requests from users who are polling virtual resource status.
    • We shall categorize requirements and create new table on wiki page. (layer?)
    • -> A general view of the participants is to have the 'HW monitoring module' outside of OpenStack
    • TODOs
      • Open etherpad page for collaborative working (Ryota)
      • Collect use cases for different fault management scenarios (Ryota)
      • Set IRC (Carlos)
      • Provide Gap Analysis (Dinesh, Everyone)
      • Provide fault management scenario based on ETSI NFV Architecture (Ashiq)
      • List fault items to be detected (Ashiq, Everyone)
  • Day of the week and time of weekly meeting
    • Monday, 6:00-7:00 PT (14:00-15:00 UTC)
    • TODO(Ryota): create weekly meeting entry in GoToMeeting
  • Tools: etherpad, ML, IRC?
    • We will use the opnfv-tech-discuss ML with a "[doctor]" tag in the subject.
    • We will use the "opnfv-doctor" IRC channel on chat.freenode.net.
    • TODO(Carlos): update wiki
  • Project schedule, visualization of deliverables
    • All team members are asked to check the project proposal page and slides, which were approved by the TSC and show our schedule and deliverables.
    • First specification of the northbound I/F by Dec 2014.

Dec 1, 2014

Agenda:

Minutes:

  • Project proposal
    • There were two comments at the project review in the TSC meeting (Nov 26)
    • Ashiq and Qiao had talked before this meeting, and agreed that we would not eliminate duplication at the proposal phase
    • Project proposal was fixed by some members
      • https://wiki.opnfv.org/doctor/project_proposal
      • The project category was changed to requirement only
      • In the new revision of the project proposal, we removed detailed descriptions that do not suit a requirement project
      • Links to the original project proposal were replaced to point to the new page, and the link to the old page that described further details can be found at the bottom of the new proposal page
      • We should not edit the proposal page after TSC approval, to keep evidence of what we planned at the beginning of the project
      • "Auto recovery" is missing, will continue discussion in mail with clarification by Tomi

Nov 17, 2014

Agenda:

  • Scoping and Scheduling (what feature to be realized in what time frame)
  • Resources available and necessary for this project
  • Technical aspects and relevance to upstream projects
  • How to socialize with upstream projects

Minutes:
